Resources


rdtsc_intel_instruction

A programming guide to the Intel x86 rdtsc instruction.

2018-07-05
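
As a quick illustration of what such a guide covers, here is a minimal sketch (not taken from the guide itself) of timing a loop with the time-stamp counter, assuming a GCC or Clang toolchain on x86-64. It uses the __rdtscp intrinsic, a variant of rdtsc that waits for earlier instructions to retire before sampling the counter:

```cpp
#include <cstdint>
#include <cstdio>
#include <x86intrin.h>  // exposes __rdtscp with GCC/Clang on x86

int main() {
    unsigned int aux;                    // receives IA32_TSC_AUX (core/socket id)
    uint64_t start = __rdtscp(&aux);     // read TSC after earlier instructions retire

    volatile uint64_t sink = 0;          // toy workload under measurement
    for (uint64_t i = 0; i < 1000000; ++i) sink = sink + i;

    uint64_t end = __rdtscp(&aux);
    std::printf("elapsed TSC ticks: %llu\n",
                static_cast<unsigned long long>(end - start));
}
```

Note that on modern CPUs the TSC ticks at a constant rate (invariant TSC), so the difference measures wall-clock time rather than core cycles, and pinning the thread to one core avoids comparing counters across sockets.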

How Java's Floating-Point Hurts Everyone Everywhere

An analysis of the shortcomings of Java's floating-point support.

2018-07-05

Large Scale Distributed Deep Networks

Recent work in unsupervised feature learning and deep learning has shown that being able to train large models can dramatically improve performance. In this paper, we consider the problem of training a deep network with billions of parameters using tens of thousands of CPU cores. We have developed a software framework called DistBelief that can utilize computing clusters with thousands of machines to train large models. Within this framework, we have developed two algorithms for large-scale distributed training: (i) Downpour SGD, an asynchronous stochastic gradient descent procedure supporting a large number of model replicas, and (ii) Sandblaster, a framework that supports a variety of distributed batch optimization procedures, including a distributed implementation of L-BFGS. Downpour SGD and Sandblaster L-BFGS both increase the scale and speed of deep network training. We have successfully used our system to train a deep network 30x larger than previously reported in the literature, achieving state-of-the-art performance on ImageNet, a visual object recognition task with 16 million images and 21k categories. We show that these same techniques dramatically accelerate the training of a more modestly sized deep network for a commercial speech recognition service. Although we focus on and report performance of these methods as applied to training large neural networks, the underlying algorithms are applicable to any gradient-based machine learning algorithm.

2018-06-07
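
To make the Downpour SGD idea concrete, below is a minimal single-process sketch, not the paper's implementation: the real parameter server is sharded across many machines and replicas communicate over a network, and the ParameterServer and worker names here are illustrative only. Several "model replica" threads repeatedly fetch possibly stale weights, compute a gradient on their own data shard, and push the update back asynchronously:

```cpp
#include <cstdio>
#include <functional>
#include <mutex>
#include <random>
#include <thread>
#include <utility>
#include <vector>

// Toy stand-in for the paper's sharded, networked parameter server.
struct ParameterServer {
    std::vector<double> w;
    std::mutex m;
    explicit ParameterServer(std::size_t dim) : w(dim, 0.0) {}
    std::vector<double> fetch() {            // replica pulls a copy of the weights
        std::lock_guard<std::mutex> lk(m);
        return w;
    }
    void push(const std::vector<double>& grad, double lr) {  // replica pushes a gradient
        std::lock_guard<std::mutex> lk(m);
        for (std::size_t i = 0; i < w.size(); ++i) w[i] -= lr * grad[i];
    }
};

// One model replica: runs SGD on its own shard against possibly stale weights.
void worker(ParameterServer& ps,
            const std::vector<std::pair<double, double>>& shard,
            int steps, double lr) {
    for (int s = 0; s < steps; ++s) {
        std::vector<double> w = ps.fetch();  // may already be stale; that is accepted
        const auto& [x, y] = shard[s % shard.size()];
        double err = (w[0] * x + w[1]) - y;  // 1-D linear model y ~ w0*x + w1
        ps.push({err * x, err}, lr);         // gradient of the squared loss
    }
}

int main() {
    ParameterServer ps(2);
    // Synthetic data for y = 3x + 1, split into one shard per replica.
    std::mt19937 rng(42);
    std::uniform_real_distribution<double> ux(-1.0, 1.0);
    std::vector<std::vector<std::pair<double, double>>> shards(4);
    for (auto& shard : shards)
        for (int i = 0; i < 256; ++i) {
            double x = ux(rng);
            shard.emplace_back(x, 3.0 * x + 1.0);
        }
    std::vector<std::thread> replicas;
    for (auto& shard : shards)
        replicas.emplace_back(worker, std::ref(ps), std::cref(shard), 2000, 0.05);
    for (auto& t : replicas) t.join();       // Downpour SGD itself has no such barrier
    std::printf("learned w0=%.3f w1=%.3f (target: 3, 1)\n", ps.w[0], ps.w[1]);
}
```

Tolerating stale parameters is the design point: replicas never synchronize with one another, which is what lets Downpour SGD scale to many replicas at the cost of computing gradients against slightly out-of-date weights.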
