A book I recommend even to data analysts: “Server/Infrastructure Engineer Training Book –log collection to visualization”

Hi, this is Yu ISHIKAWA at ATL.

One of the authors, @johtani of the Elasticsearch company, gave me a copy of the book “Server/Infrastructure Engineer Training Book –log collection to visualization.” The book is written for server and infrastructure engineers, but it is well worth reading for data analysts too. As real-time processing becomes necessary and the data being analyzed keeps growing, I think it helps analysts understand the technical difficulties involved and how they are solved.

This book covers Fluentd, which can collect large volumes of logs efficiently; Elasticsearch, which has attracted a lot of attention as a data store and search/analysis server; and Kibana, a visualization tool used together with them. Elasticsearch is an OSS distributed full-text search server developed by the Elasticsearch company and others, built mainly on the full-text search library Apache Lucene. Kibana is a visualization front end that lets you build visualizations interactively in a Web browser.

[Figure: Kibana]

Source: http://www.elasticsearch.org/blog/kibana-3-0-0-ga-now-available/

What interested me is that the book presents use cases of Elasticsearch as a store for data analysis results, with Kibana as the visualization tool for them. Elasticsearch can also integrate with the input and output of the Hadoop ecosystem. Since Apache Hive is an aggregation tool commonly used in many companies, it is nice to be able to store its aggregated results in Elasticsearch and check the plots in Kibana.
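To make the idea of storing aggregated results in Elasticsearch a little more concrete, here is a minimal sketch that turns a few aggregated rows into the newline-delimited JSON body that the Elasticsearch Bulk API accepts. This is not taken from the book: the index name `hive_daily_pv`, the row fields, and the helper `build_bulk_payload` are all made up for illustration, and the action lines assume a recent Elasticsearch version that no longer requires a mapping type.

```python
import json


def build_bulk_payload(index_name, rows, id_field=None):
    """Build a Bulk API request body (NDJSON) from a list of dict rows.

    Each row becomes two lines: an "index" action line and the document itself.
    `index_name` and `id_field` are hypothetical parameters for this sketch.
    """
    lines = []
    for row in rows:
        action = {"index": {"_index": index_name}}
        if id_field is not None:
            # Use one of the row's fields as the document ID.
            action["index"]["_id"] = str(row[id_field])
        lines.append(json.dumps(action))
        lines.append(json.dumps(row))
    if not lines:
        return ""
    # The Bulk API expects the body to end with a newline.
    return "\n".join(lines) + "\n"


# Hypothetical aggregated rows, e.g. daily page views computed by a Hive query.
rows = [
    {"date": "2014-08-01", "page": "/index.html", "pv": 1200},
    {"date": "2014-08-02", "page": "/index.html", "pv": 1350},
]

payload = build_bulk_payload("hive_daily_pv", rows)
print(payload)
```

The resulting string could then be sent to the `_bulk` endpoint of a cluster with the `application/x-ndjson` content type, after which the documents would be available to plot in Kibana. For Hive specifically, the Elasticsearch for Apache Hadoop connector is likely the more direct route than building the body by hand.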

How to store and visualize data analysis results efficiently is one of the biggest problems. You can build something yourself with MongoDB, JavaScript, and so on, but implementing it is costly. So we want an OSS tool, and in that respect Kibana can be one of the choices.

The book of course also covers tips for using Fluentd, Elasticsearch, and Kibana, so I would like not only beginner engineers but also data analysts to read it.