

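Original: Key Points of Spark Internals

Wide and narrow dependencies. Wide and narrow dependencies describe how the partitions of a child RDD are derived from the partitions of its parent RDD. Under a narrow dependency, parent and child RDD partitions keep a one-to-one relationship.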

2020-09-18 20:47:48 101
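
The distinction in the excerpt above can be sketched without Spark itself. The following is a plain-Python illustration, not Spark code: partitions are modeled as lists, and the helper names `narrow_map` and `wide_group_by_key` are hypothetical. It assumes integer keys and a simple modulo partitioner, and it records which parent partitions each child partition reads from. The narrow case matches the one-to-one relationship described above; the wide case shows what a shuffle-style regrouping by key does instead.

```python
def narrow_map(parent_partitions, fn):
    """Apply fn element-wise; child partition i reads only parent partition i."""
    children = [[fn(x) for x in part] for part in parent_partitions]
    deps = [{i} for i in range(len(parent_partitions))]
    return children, deps


def wide_group_by_key(parent_partitions, num_children):
    """Group (key, value) pairs by key; a child partition may read many parents.

    Assumption: keys are integers and are routed with key % num_children.
    """
    children = [dict() for _ in range(num_children)]
    deps = [set() for _ in range(num_children)]
    for i, part in enumerate(parent_partitions):
        for key, value in part:
            target = key % num_children
            children[target].setdefault(key, []).append(value)
            deps[target].add(i)  # remember which parent partition was read
    return children, deps


parent = [
    [(1, "a"), (2, "b")],
    [(1, "c"), (3, "d")],
    [(2, "e"), (4, "f")],
]

mapped, narrow_deps = narrow_map(parent, lambda kv: (kv[0], kv[1].upper()))
grouped, wide_deps = wide_group_by_key(parent, num_children=2)

print(narrow_deps)  # one parent index per child partition
print(wide_deps)    # several parent indices per child partition
```

In this toy data every child partition of the grouped result reads from more than one parent partition, which is why such a step needs data to move between partitions, while the mapped result can be computed partition by partition.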

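CDN Technology Explained (《CDN技术详解》), high-resolution scan with bookmarks

Covers every aspect of CDNs; a valuable reference for anyone working in this field.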

2018-05-23

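Everybody Lies [Everybody Lies: big data reveals who we really are], English edition

Copyright belongs to the author; contact the author before reproducing this in any form. Author: asterisk (from Douban). Source: https://book.douban.com/review/8744398/
This is another book about big data. Compared with similar books it offers little that is new, and I finished it by leafing through it on and off. Still, here is a brief summary:
1. Big data versus common wisdom. Common wisdom is essentially small data. It usually works, but it carries obvious bias, because people tend to overweight their own experience and underestimate the probability of things they have not lived through. Common sense is useful, but watch out for its many biases.
2. In the big data era, correlation can solve many problems. This view is no longer new. Whereas expert experience relies on domain knowledge, correlation can achieve useful results without knowing the cause, although the lack of attribution still carries a risk of bias or error.
3. A person's behavior is more truthful than their words: the traces left unintentionally in search engines and apps are more reliable than answers given to surveys. On the one hand, according to research in psychology and neuroscience, the mind is divided into the unconscious and the conscious. Unconscious behavior involves no complex decision-making or interpretation, so it often reflects true intent, while conscious behavior is biased in some situations, for example when morality or the wish to save face leads people to embellish their own situation or that of people close to them. On the other hand, the trace data people leave behind is more varied and richer, which greatly improves our ability to understand people.
4. The big data era also brought the A/B test. It is arguably the universal tool of today's internet companies: from the positioning of a product down to the choice of an icon, a result can be obtained with an A/B test. It is a methodology unique to the big data era and has proven effective. Despite the risk of overuse, it can be called a great success overall.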

2018-05-14

MariaDB Crash Course [quick-start guide to the MariaDB database (the open-source community edition of MySQL)], English edition

MariaDB is an offshoot of MySQL, one of the most popular database management systems in the world. From small development projects to some of the best-known and most prestigious sites on the Web, MySQL has proven itself to be a solid, reliable, fast, and trusted solution to all sorts of data storage needs.

2018-05-14

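Blockchain Technology Guide

“Blockchain books currently on the market fall roughly into two categories. The first, represented by Melanie Swan's Blockchain: Blueprint for a New Economy, consists of strategic books about the revolutionary impact of blockchain at the macro level. The second consists of technical books focused on Bitcoin, represented by Andreas M. Antonopoulos's Mastering Bitcoin and by Bitcoin and Cryptocurrency Technologies, written by a team led by Arvind Narayanan of Princeton University. These books serve two groups of readers: business-oriented readers interested in industry applications of blockchain, and technically oriented readers interested in Bitcoin technology. Between the markets these two categories cover, a large gap remains. We found that there seems to be no book yet that systematically dissects the overall blockchain architecture (including blockchain 1.0, 2.0, and 3.0), systematically discusses its key technologies (such as cryptography and consensus algorithms), and systematically introduces the different architectural forms (consortium chains, public chains, private chains, sidechains, multi-chains, interconnected chains, and so on). Such a book could help people understand and popularize blockchain technology and move blockchain applications into practice. So rather than wait for such a book to appear, we decided to act ourselves and make a modest contribution to promoting blockchain technology. The author has therefore, perhaps overreaching, taken what may be […]”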

2018-05-14

Advanced.Analytics.with.Spark [Advanced Analytics with Spark]

I don’t like to think I have many regrets, but it’s hard to believe anything good came out of a particular lazy moment in 2011 when I was looking into how to best distribute tough discrete optimization problems over clusters of computers. My advisor explained this newfangled Spark thing he had heard of, and I basically wrote off the concept as too good to be true and promptly got back to writing my undergrad thesis in MapReduce. Since then, Spark and I have both matured a bit, but one of us has seen a meteoric rise that’s nearly impossible to avoid making “ignite” puns about. Cut to two years later, and it has become crystal clear that Spark is something worth paying attention to. Spark’s long lineage of predecessors, running from MPI to MapReduce, makes it possible to write programs that take advantage of massive resources while abstracting away the nitty-gritty details of distributed systems. As much as data processing needs have motivated the development of these frameworks, in a way the field of big data has become so related to these frameworks that its scope is defined by what these frameworks can handle. Spark’s promise is to take this a little further: to make writing distributed programs feel like writing regular programs. Spark will be great at giving ETL pipelines huge boosts in performance and easing some of the pain that feeds the MapReduce programmer’s daily chant of despair (“why? whyyyyy?”) to the Hadoop gods. But the exciting thing for me about it has always been what it opens up for complex analytics. With a paradigm that supports iterative algorithms and interactive exploration, Spark is finally an open source framework that allows a data scientist to be productive with large data sets.
I think the best way to teach data science is by example. To that end, my colleagues and I have put together a book of applications, trying to touch on the interactions between the most common algorithms, data sets, and design patterns in large-scale analytics. This book isn’t meant to be read cover to cover. Page to a chapter that looks like something you’re trying to accomplish, or that simply ignites your interest.

2018-05-13

A survey of security and privacy in connected vehicles (survey paper on connected-vehicle security)

Electronic Control Units (ECUs) of a vehicle control the behavior of its devices, e.g., brake and engine. They communicate through the in-vehicle network. Vehicles communicate with other vehicles and Road Side Units (RSUs) through Vehicular Ad-hoc Networks (VANets), with personal devices through Wireless Personal Area Networks (WPANs), and with service center systems through cellular networks. A vehicle that uses an external network, in addition to the in-vehicle network, is called a connected vehicle. A connected vehicle could benefit from smart mobility applications: applications that use information generated by vehicles, e.g., cooperative adaptive cruise control. However, connecting the in-vehicle network, VANet, WPAN, and cellular network increases the count and complexity of threats to vehicles, which makes developing security and privacy solutions for connected vehicles more challenging. In this work we provide a taxonomy for security and privacy aspects of connected vehicles. The aspects are: security of communication links, data validity, security of devices, identity and liability, access control, and privacy of drivers and vehicles. We use the taxonomy to classify the main threats to connected vehicles, and existing solutions that address the threats. We also report on the (only) approach for verifying the security and privacy architecture of connected vehicles that we found in the literature. The taxonomy and survey could be used by security architects to develop security solutions for smart mobility applications.

2018-05-13

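Past postgraduate entrance exam papers, Department of Automation, University of Science and Technology of China (USTC)

Postgraduate entrance exam papers from the USTC Department of Automation. Truly valuable material, with answers included.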

2010-03-19

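Least squares MATLAB source code

The most classic system identification method, written in MATLAB. Includes five least-squares (LS) variants, among them basic LS, GLS, and ELS.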

2010-03-19
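
As a rough illustration of what basic LS means in system identification, here is a minimal Python sketch. It is not the uploaded MATLAB code. It assumes a hypothetical first-order model y[k] = a*y[k-1] + b*u[k-1], and the function names `simulate` and `fit_first_order` are made up for this example. The parameters are estimated by solving the 2x2 normal equations directly. GLS and ELS, which also model the noise, are not shown.

```python
def simulate(a, b, u, y0=0.0):
    """Generate noise-free output of y[k] = a*y[k-1] + b*u[k-1]."""
    y = [y0]
    for k in range(1, len(u) + 1):
        y.append(a * y[k - 1] + b * u[k - 1])
    return y


def fit_first_order(u, y):
    """Estimate (a, b) by ordinary least squares from input u and output y.

    Expects len(y) == len(u) + 1, as produced by simulate().
    """
    if len(y) != len(u) + 1 or len(u) < 2:
        raise ValueError("need len(y) == len(u) + 1 and at least two samples")
    s_yy = s_yu = s_uu = r_y = r_u = 0.0
    for k in range(1, len(y)):
        prev_y, prev_u, target = y[k - 1], u[k - 1], y[k]
        s_yy += prev_y * prev_y
        s_yu += prev_y * prev_u
        s_uu += prev_u * prev_u
        r_y += prev_y * target
        r_u += prev_u * target
    det = s_yy * s_uu - s_yu * s_yu
    if abs(det) < 1e-12:
        raise ValueError("regressors are not sufficiently exciting")
    # Cramer's rule on the normal equations
    a = (r_y * s_uu - r_u * s_yu) / det
    b = (s_yy * r_u - s_yu * r_y) / det
    return a, b


u = [1.0, 0.0, 1.0, 1.0, 0.0, 1.0]
y = simulate(0.5, 2.0, u)
a_hat, b_hat = fit_first_order(u, y)
print(a_hat, b_hat)
```

With noise-free data the estimates should match the true parameters up to rounding. With noisy data, plain LS can be biased when the noise is correlated, which is the usual motivation for variants such as GLS and ELS.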

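filelock: portable file encryption software

An encryption tool I started using only yesterday. It can encrypt several files at once. To encrypt a folder, pack it into an archive first. It has worked well so far. The software needs no installation, but you do have to click "绿化" (make portable). Before setting it up, read the instructions in the extracted directory. The initial password of the encryptor is 123456; change it after setup.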

2010-03-11

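Mathematica v6.0 keygen

I have used this keygen and it worked on my computer. I needed Mathematica for a mathematics lab elective, and we poor students could not afford the licensed software, so downloading a keygen was the only option.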

2010-03-11

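LabVIEW 8.2 keygen

For installing LabVIEW 8.2. I meant to upload LabVIEW itself, but I do not have that much space. The keygen registers the version I have; whether it works for other versions remains to be verified.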

2010-03-09

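Kingsoft PowerWord (金山词霸) crack patch: use the Professional edition for free

For Kingsoft PowerWord 2007. Usage: after installing PowerWord, do not run it yet. Copy the four files in the patch into the installation directory, overwriting the four original files.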

2010-03-09

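Acrobat Professional 8.1.2 keygen

I read a lot of PDFs, and the free Adobe Reader available online did not satisfy me because it cannot add annotations. So I downloaded Acrobat Professional, which can edit and create PDF files. But the trial version kept reminding me to register and only lasted 30 days, which was annoying, so this keygen came in handy.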

2010-03-09

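Server scheduling simulation source code for NS2

This program is built on NS2 and manages a server cluster using various scheduling policies. NS2 is an open-source simulator written in C++ (with OTcl scripting) and is widely used in network simulation research.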

2010-03-09

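Source code for a text encryption editor written in VC

I built a text encryption/decryption executable with MFC, suitable for encrypting and decrypting Chinese and English text. The encryption method resembles the Caesar cipher, but it works on ASCII values rather than alphabet order, so it can encrypt uppercase and lowercase letters, digits, and punctuation alike.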

2010-03-08
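
A Caesar-style shift over ASCII values, as described above, might look like the following Python sketch. It is not the uploaded MFC source. The choice of the printable range 32 to 126, the wrap-around rule, and the function names `shift_text`, `encrypt`, and `decrypt` are all assumptions made for illustration. Non-ASCII characters such as Chinese text are left unchanged here, which may differ from what the original program does.

```python
FIRST, LAST = 32, 126        # assumed range: printable ASCII characters
SPAN = LAST - FIRST + 1      # 95 characters in the range


def shift_text(text, key):
    """Shift every printable ASCII character by key, wrapping within the range."""
    out = []
    for ch in text:
        code = ord(ch)
        if FIRST <= code <= LAST:
            out.append(chr(FIRST + (code - FIRST + key) % SPAN))
        else:
            out.append(ch)   # assumption: leave non-ASCII characters untouched
    return "".join(out)


def encrypt(text, key):
    return shift_text(text, key)


def decrypt(text, key):
    return shift_text(text, -key)


message = "Hello, World! 123"
secret = encrypt(message, 7)
print(secret)
print(decrypt(secret, 7))
```

Because the shift runs over the whole printable range instead of the 26 letters, digits and punctuation are transformed as well. Like any Caesar-style cipher, it offers no real security and is only suitable as a teaching example.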

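vim 7.2 for Windows

A build of vim that installs and runs on Windows. It is far more capable than Notepad or WordPad. vi is one of the two major Unix editors and has several modes, including command mode and editing mode. vim is an enhanced version of vi. Why use vim on Windows too? First, because it is powerful. Second, users who keep switching between Linux and Windows can practice vim at any time.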

2010-03-08

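Peopleware e-book (txt)

Bruce Eckel lists his favorite books in the appendix of his "Thinking in" series of programming books, and Peopleware is one of them. As the name suggests, "peopleware" is the human counterpart to hardware and software. The book explains how to raise productivity by managing relationships within a software development team. Programmers get by not on technical skill alone but also on the ability to deal with people, so this is less a technical book than a book about life.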

2010-03-08

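The Programmer's Parchment Scrolls (《程序员羊皮卷》), e-book, txt edition

This book offers advice to IT students and practitioners on the problems they face in life and work, covering postgraduate entrance exams, job hunting, the work environment, and interpersonal relationships. It gives a particularly detailed analysis of how programmers should deal with headhunters (firms that recruit talent on behalf of IT companies). Essential reading for programmers and people in related roles.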

2010-03-08
