  • Resources (46)

Airflow Documentation

Airflow is a platform to programmatically author, schedule and monitor workflows. Use Airflow to author workflows as directed acyclic graphs (DAGs) of tasks. The Airflow scheduler executes your tasks on an array of workers while following the specified dependencies. Rich command line utilities make performing complex surgeries on DAGs a snap. The rich user interface makes it easy to visualize pipelines running in production, monitor progress, and troubleshoot issues when needed.

2018-10-31

Building (Better) Data Pipelines using Apache Airflow

Airflow is a platform to programmatically author, schedule and monitor workflows (a.k.a. DAGs, or Directed Acyclic Graphs).

2018-10-31

An.Introduction.to.Machine.Learning.with.Application.in.R

The purpose of this document is to provide a conceptual introduction to statistical or machine learning (ML) techniques for those who might not normally be exposed to such approaches during their typical required statistical training. Machine learning can be described as a form of statistics, often even utilizing well-known and familiar techniques, that has a bit of a different focus than traditional analytical practice in the social sciences and other disciplines. The key notion is that flexible, automatic approaches are used to detect patterns within the data, with a primary focus on making predictions on future data.

2018-10-28

Springer - R.A.Carmona - Statistical analysis of financial data in S-PLUS (Douban)

This book grew out of lecture notes written for a one-semester junior statistics course offered to the undergraduate students majoring in the Department of Operations Research and Financial Engineering at Princeton University. Tidbits of the history of this course will shed light on the nature and spirit of the book.

2018-10-28

Portfolio_Optimization_with_R_Rmetrics

This is a book about portfolio optimization from the perspective of computational finance and financial engineering. Thus the main emphasis is to briefly introduce the concepts and to give the reader a set of powerful tools to solve the problems in the field of portfolio optimization.

2018-10-28

R Graphics

This practical guide provides more than 150 recipes to help you generate high-quality graphs quickly, without having to comb through all the details of R's graphing systems. Each recipe tackles a specific problem with a solution you can apply to your own project, and includes a discussion of how and why the recipe works. Most of the recipes use the ggplot2 package, a powerful and flexible way to make graphs in R. If you have a basic understanding of the R language, you're ready to get started.

2018-10-28

Social media Mining with R

Explore the social media APIs in R to capture data and tame it. Employ the machine learning capabilities of R to gain optimal business value. A hands-on guide with real-world examples to help you take advantage of the vast opportunities that come with social media data.

2018-10-28

scikits-learn user guide

A basic introduction to writing extensions with the scikit-learn package.

2018-10-28

Visualize This

This book is example-driven and written to give you the skills to take a graphic from start to finish. You can read it cover to cover, or you can pick your spots if you already have a dataset or visualization in mind. The chapters are organized so that the examples are self-contained. If you’re new to data, the early chapters should be especially useful to you. They cover how to approach your data, what you should look for, and the tools available to you. You can see where to find data and how to format and prepare it for visualization. After that, the visualization techniques are split by data type and what type of story you’re looking for.

2018-10-28

Writing R extensions

A basic introduction to writing extensions with the R language.

2018-10-28

The R reference index

A very comprehensive documentation index for the R language API. Suitable for R users at every level of proficiency.

2018-10-28

R Data Import Export

A basic introduction to the import and export functionality of R.

2018-10-28

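Python高级编程 (Advanced Python Programming)

Through numerous examples, this book presents best practices and agile development methods for the Python language, and covers advanced topics across the whole software life cycle, such as continuous integration, version control systems, package release and distribution, development models, and documentation writing. It first shows how to set up an optimal development environment and then, following the thread of agile Python development, explains how to apply proven object-oriented principles to design. This material gives developers and project managers many advanced concepts and expert-level advice spanning all of software engineering, some of which matters beyond the Python language itself.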

2018-10-27

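编码的奥秘 (Code: The Hidden Language of Computer Hardware and Software)

The desire to communicate is in most people's nature. In this book, "code" usually refers to a system for converting information between people and machines. In other words, code is communication. We sometimes regard codes as mysterious, but most are not. Most codes need to be well understood because they are the basis of human communication. In this book, author Charles Petzold uses common objects and familiar language systems such as Braille and Morse code to build an engaging narrative for anyone who has ever wondered about the secret inner "life" of computers and other smart machines. The book combines clear illustrations with lively stories. Following the author's line of presentation, you will find that you have gained a real context for understanding today's PCs, digital multimedia, and the Internet. Whatever your level of technical expertise, the book will captivate you and may well inspire you to take part in computing.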

2018-10-27

Stream Data Processing A Quality of Service Perspective

The systems used to process data streams and provide for the needs of stream-based applications are Data Stream Management Systems (DSMSs). This book presents a new paradigm to meet the needs of these applications, including a detailed discussion of the techniques proposed. It includes important aspects of a QoS-driven DSMS (Data Stream Management System), introduces applications where a DSMS can be used, and discusses needs beyond the stream processing model. It also discusses in detail the design and implementation of MavStream. This volume is primarily intended as a reference book for researchers and advanced-level students in computer science. It is also appropriate for practitioners in industry who are interested in developing applications.

2018-10-27

Mining of Massive Datasets

The book is based on Stanford Computer Science course CS246: Mining Massive Datasets (and CS345A: Data Mining). The book, like the course, is designed at the undergraduate computer science level with no formal prerequisites. To support deeper explorations, most of the chapters are supplemented with further reading references.

2018-10-27

Redis Cookbook

Redis is an open source, advanced key-value store. It is often referred to as a data structure server since keys can contain strings, hashes, lists, sets and sorted sets. This book provides developers with problems and solutions in a useful cookbook style. It is an example-driven ebook.

2018-10-27

Professional NoSQL

A hands-on guide to leveraging NoSQL databases. NoSQL databases are an efficient and powerful tool for storing and manipulating vast quantities of data. Most NoSQL databases scale well as data grows. In addition, they are often malleable and flexible enough to accommodate semi-structured and sparse data sets. This comprehensive hands-on guide presents fundamental concepts and practical solutions for getting you ready to use NoSQL databases.

2018-10-27

CouchDB.The.Definitive.Guide

This book introduces you to Apache CouchDB, a document-oriented database that offers a different way to model your data. CouchDB is a schema-free database, designed to work with applications that handle document-based information such as contacts, invoices, and receipts. In "CouchDB: The Definitive Guide", three of the core developers gently explain how to work with CouchDB, using clear and practical scenarios. Each chapter showcases key features, such as simple document CRUD (create, read, update, delete), advanced MapReduce, and deployment tuning for performance and reliability. With this book, you will: understand the basics of document-based storage and manipulation; model data as self-contained JSON documents; manage basic document CRUD; handle evolving data naturally; query and aggregate data in CouchDB using MapReduce views; replicate data between nodes; and carry out deployment tuning for performance and reliability.

2018-10-27

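MapReduce Architecture

MapReduce is a programming model and an associated implementation for processing and generating very large data sets. Users first create a Map function that processes a set of key/value pairs and outputs an intermediate set of key/value pairs; they then create a Reduce function that merges all intermediate values associated with the same intermediate key. Many real-world tasks fit this processing model, and this paper describes the model in detail.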
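
The Map and Reduce steps described above can be sketched in a few lines of plain Python. This is only a single-process illustration of the model, not the paper's distributed implementation; the names `map_fn`, `reduce_fn`, and `run_mapreduce` and the sample documents are assumptions made up for this example.

```python
from collections import defaultdict

def map_fn(_key, text):
    """Map: emit an intermediate (word, 1) pair for every word in the input value."""
    for word in text.split():
        yield word.lower(), 1

def reduce_fn(key, values):
    """Reduce: merge all intermediate values that share one intermediate key."""
    return key, sum(values)

def run_mapreduce(records, map_fn, reduce_fn):
    """Hypothetical in-memory driver: map every record, group by key, then reduce."""
    groups = defaultdict(list)
    for key, value in records:
        for ikey, ivalue in map_fn(key, value):
            groups[ikey].append(ivalue)  # grouping ("shuffle") by intermediate key
    return dict(reduce_fn(k, vs) for k, vs in groups.items())

# Illustrative input: (document name, document contents) pairs.
docs = [("doc1", "the quick fox"), ("doc2", "the lazy dog the end")]
counts = run_mapreduce(docs, map_fn, reduce_fn)
```

Here `counts` maps each word to its total number of occurrences across both documents. A real system would run the map and reduce calls on many machines, which this sketch does not attempt.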

2018-10-26

building query compilers

The book tries to demystify query optimization and query optimizers. By means of the multi-lingual query optimizer BD II, the most important aspects of query optimizers and their implementation are discussed. We concentrate not only on the query optimizer core (Rewrite I, Plan Generator, Rewrite II) of the query compilation process but touch on all issues from parsing to code generation and quality assurance.

2018-10-26

planning for big data

A CIO's Handbook to the Changing Data Landscape. Practical implementation.

2018-10-26

Principles of Distributed Database Systems

The material concentrates on fundamental theories as well as techniques and algorithms. The advent of the Internet and the World Wide Web, and, more recently, the emergence of cloud computing and streaming data applications, has forced a renewal of interest in distributed and parallel data management, while, at the same time, requiring a rethinking of some of the traditional techniques.

2018-10-26

sams teach yourself emacs in 24 hours

The focus in the book is on using Emacs with a graphical interface, either X Window or Microsoft Windows. Thus no time will be wasted on discussing how to make Emacs work when you have a monitor that displays only 25 lines with 80 characters each.

2018-10-26

the linux command line

Designed for the new command line user, this 544-page volume covers the same material as LinuxCommand.org but in much greater detail. In addition to the basics of command line use and shell scripting, The Linux Command Line includes chapters on many common programs used on the command line, as well as more advanced topics.

2018-10-26

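高性能javascript编程 (High Performance JavaScript Programming)

High Performance JavaScript reveals techniques and strategies that help you eliminate performance bottlenecks during development. You will learn how to improve performance in every area, including code loading, execution, DOM interaction, and the page life cycle.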

2018-10-26

java performance

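This book covers every aspect of JVM tuning, benchmarking, and profiling. Its opening chapter, "Strategies, Steps, and Methodology", gives a high-level view of how to handle performance tuning during development.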

2018-10-26

modern c++ design

The book makes use of and explores a C++ programming technique called template metaprogramming.

2018-10-26

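机器学习实践指南:案例应用解析 (A Practical Guide to Machine Learning: Case Studies and Applications)

This book is a classic in the field of machine learning and distills an intelligent-computing expert's many years of experience. It explains the algorithmic theory of machine learning from a fresh angle, systematically presents practical methods and application techniques through case studies, and guides readers smoothly into engineering practice.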

2018-10-26

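机器学习实战 (Machine Learning in Action)

Through carefully arranged examples, this book starts from everyday work tasks, avoids academic language, and uses efficient, reusable Python code to explain how to process statistical data and perform data analysis and visualization. Readers can learn some core machine learning algorithms and apply them to strategic tasks such as classification, prediction, and recommendation.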

2018-10-26

Introduction to the Modeling and Analysis of Complex Systems

Introduction to the Modeling and Analysis of Complex Systems introduces students to mathematical/computational modeling and analysis developed in the emerging interdisciplinary field of Complex Systems Science.

2018-10-26

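Parallel and distributed simulation systems

An introduction to simulating parallel and distributed systems. It emphasizes theoretical derivation and algorithm implementation, and is a very practical textbook.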

2018-10-26

Modelling and Simulation_ Exploring Dynamic System Behaviour

A textbook on realizing dynamic systems through modeling and simulation, with many code examples.

2018-10-26

Introduction to Modeling and Analysis of Stochastic Systems

Modeling and Analysis of Stochastic Systems. Very comprehensive.

2018-10-25

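golang web development

A technical guide to web development in the Go language. Suitable for anyone who wants to learn golang and build web applications.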

2018-10-25

learning_spark_sql

How to work with SQL databases and run queries efficiently using the big data tool Spark.

2018-10-25

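Data structures, algorithms, leetcode

Detailed explanations of multiple solutions to selected leetcode problems, to make problem practice more efficient.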

2018-10-25

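Data structures and algorithms, leetcode

Detailed explanations of multiple solutions to selected leetcode problems, to make problem practice more efficient.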

2018-10-25

leetcode solutions

A very good collection of algorithm interview questions with detailed analysis. All material comes from leetcode.

2018-10-25

python game programming by example

A tutorial on game development with Python, with many hands-on development examples. Suitable for developers at every level.

2018-10-25
