- Resources (4)
rdtsc_intel_instruction
A programming guide for the Intel x86 rdtsc instruction.
2018-07-05
How Java's Floating-Point Hurts Everyone Everywhere
An analysis of the shortcomings of Java's floating-point support.
2018-07-05
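Kahan's critique is tied to Java specifics, but one of its central points, that forbidding wider intermediate precision turns representable results into overflow, can be sketched in a few lines. The f32 helper and the constants below are our own illustration, not code from the paper:

```python
import ctypes

def f32(x):
    """Round a Python float (IEEE double) to IEEE single precision."""
    return ctypes.c_float(x).value

a, b, c = 1e20, 1e20, 1e25

# Strict single-precision evaluation: every intermediate is rounded to
# float, so a*b overflows to +inf and the final result is +inf.
strict = f32(f32(a * b) / c)

# Wider (double) intermediate with one final rounding: 1e40 is fine in
# double, and the quotient 1e15 fits comfortably in single precision.
wide = f32((a * b) / c)

print(strict)  # inf
print(wide)    # ~1e15
```

The mathematically correct answer is representable in single precision either way; only the rule "round every intermediate to float" destroys it, which is one class of problem the paper attributes to Java's evaluation model.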
Large Scale Distributed Deep Networks
Recent work in unsupervised feature learning and deep learning has shown that being able to train large models can dramatically improve performance. In this paper, we consider the problem of training a deep network with billions of parameters using tens of thousands of CPU cores. We have developed a software framework called DistBelief that can utilize computing clusters with thousands of machines to train large models. Within this framework, we have developed two algorithms for large-scale distributed training: (i) Downpour SGD, an asynchronous stochastic gradient descent procedure supporting a large number of model replicas, and (ii) Sandblaster, a framework that supports a variety of distributed batch optimization procedures, including a distributed implementation of L-BFGS. Downpour SGD and Sandblaster L-BFGS both increase the scale and speed of deep network training. We have successfully used our system to train a deep network 30x larger than previously reported in the literature, achieving state-of-the-art performance on ImageNet, a visual object recognition task with 16 million images and 21k categories. We show that these same techniques dramatically accelerate the training of a more modestly-sized deep network for a commercial speech recognition service. Although we focus on and report performance of these methods as applied to training large neural networks, the underlying algorithms are applicable to any gradient-based machine learning algorithm.
2018-06-07
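As a rough, single-machine analogy of the asynchronous updates the abstract describes, the sketch below runs several worker threads that each read possibly stale parameters from a shared store, compute a gradient on their own data shard, and push an update with no synchronization barrier. The toy linear model and all names here are assumptions for illustration; real Downpour SGD shards the parameters across a parameter server and runs workers on separate machines:

```python
import random
import threading

true_w = 3.0
data = [(x, true_w * x) for x in (random.uniform(-1, 1) for _ in range(4000))]
params = {"w": 0.0}  # stand-in for the shared parameter-server state
lr = 0.05

def worker(shard):
    for x, y in shard:
        # Fetch (possibly stale) parameters, compute a local gradient on
        # one example, and push the update back asynchronously.
        w = params["w"]
        grad = 2.0 * (w * x - y) * x
        params["w"] = params["w"] - lr * grad  # lock-free update

# Split the data into four shards, one per asynchronous worker.
shards = [data[i::4] for i in range(4)]
threads = [threading.Thread(target=worker, args=(s,)) for s in shards]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(params["w"])  # converges near true_w despite stale reads
```

The point of the sketch is the failure mode the abstract tolerates by design: reads can be stale and updates can interleave, yet for this convex toy objective the shared parameter still converges, which is the intuition behind scaling SGD with many loosely coupled model replicas.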