[Original] Bloom filter: notes on key points

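Bloom filter in brief: given a target set A and a set B to be tested, determine for each record in B whether it belongs to A. First, compute a feature value for every record in A and save all the feature values into a set C. Then, for each record in B, compute its feature value with the same method and check whether that value exists in C. The book Hadoop in Action gives an example. Notes on its key points: Obtaining the feature value: the MD5 algorithm is used to compute the feature value, MessageDig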
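The summary above is cut off before it shows how the digest is used, so the following is only a minimal sketch of the idea it describes, not the code from the post or from Hadoop in Action. It assumes the feature values are bit positions derived from MD5 via `java.security.MessageDigest`, and that set C is held in a `java.util.BitSet`. The class name `BloomFilterSketch`, the salting scheme, and the parameters `numBits` and `numHashes` are all hypothetical.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.BitSet;

// Hypothetical sketch: a Bloom filter whose bit positions come from MD5 digests.
class BloomFilterSketch {
    private final BitSet bits;     // plays the role of set C (stored feature values)
    private final int numBits;     // assumed filter size in bits
    private final int numHashes;   // assumed number of positions per record

    BloomFilterSketch(int numBits, int numHashes) {
        this.bits = new BitSet(numBits);
        this.numBits = numBits;
        this.numHashes = numHashes;
    }

    // Derive numHashes bit positions by salting the record with the hash
    // index, taking the MD5 digest, and reducing it modulo the filter size.
    private int[] positions(String record) {
        int[] result = new int[numHashes];
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            for (int i = 0; i < numHashes; i++) {
                md5.reset();
                md5.update((byte) i); // salt: a different prefix for each hash
                byte[] digest = md5.digest(record.getBytes(StandardCharsets.UTF_8));
                result[i] = new BigInteger(1, digest)
                        .mod(BigInteger.valueOf(numBits))
                        .intValue();
            }
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
        return result;
    }

    // Record one element of set A.
    void add(String record) {
        for (int p : positions(record)) {
            bits.set(p);
        }
    }

    // Test one element of set B. A false result is definite; a true result
    // can be a false positive, because different records may share bits.
    boolean mightContain(String record) {
        for (int p : positions(record)) {
            if (!bits.get(p)) {
                return false;
            }
        }
        return true;
    }
}
```

In this sketch every record of A is passed to `add`, and each record of B is then checked with `mightContain`. Records reported as absent are certainly not in A, while records reported as present still need an exact check if false positives matter.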

2015-10-10 11:06:14 134

ElasticSearch book: Mastering ElasticSearch

Welcome to the world of ElasticSearch and to the Mastering ElasticSearch book. While reading the book you'll be taken through different topics, all connected to ElasticSearch. We will start with the introduction to Apache Lucene and ElasticSearch, because even if you are familiar with it, it is crucial to have the background in order to fully understand what is going on when you form a cluster, send a document for indexing, or make a query. You will learn how Apache Lucene scoring works, how to influence it, and how to tell ElasticSearch to choose different scoring algorithms. The book will show you what query rewriting is and why it happens. Apart from that, you'll see how to change your queries to leverage ElasticSearch caching capabilities and make maximum use of it. After that we will focus on index control. We will learn the way to change how index fields are written, by using different posting formats. We will discuss segments merging, why it is important, and how to adjust it when there is a need. We'll take a deeper look at shard allocation mechanism and routing, and finally we'll learn what to do when data and query number grows. The book can't omit garbage collector description—how it works and where to start and when you need to tune its behavior. In addition to that, it covers functionalities that allow us to troubleshoot ElasticSearch, such as describing how segments merging works, how to see what ElasticSearch does beneath its high-level interface, and how to limit the I/O operations. But the book doesn't only pay attention to low-level aspects of ElasticSearch; it includes user search experience improvements tips, such as dealing with spelling mistakes, highly effective autocomplete feature, and a tutorial on how you can deal with query related improvements. 
In addition to this, the book you are holding will guide you through ElasticSearch Java API, showing how to use it, not only when it comes to CRUD operations but also when it comes to cluster and indices maintenance and manipulation. Finally, we will take a deep look at ElasticSearch extensions by developing a custom river plugin for data indexing and a custom analysis plugin for data analysis during query and index time.

2016-03-19
