Python Business Intelligence Cookbook
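Book Information
| Author | Robert Dempsey |
| Publisher | Packt Publishing |
| ISBN | 9781785289668 |
| Publication date | 2015-12-22 |
| Word count | 707,000 |
| Categories | Imported books, foreign-language original editions, computers, internet |
Book Description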
Leverage the computational power of Python with more than 60 recipes that arm you with the skills required to make informed business decisions.
About This Book
- Want to minimize risk and optimize the profits of your business? Learn to create efficient analytical reports with ease using this highly practical, easy-to-follow guide
- Learn to apply Python to business intelligence tasks (preparing, exploring, analyzing, visualizing, and reporting) in order to make more informed business decisions using the data at hand
- Learn to explore and analyze business data, and build business intelligence dashboards with the help of various insightful recipes
Who This Book Is For
This book is intended for data analysts, managers, and executives with a basic knowledge of Python who now want to use Python for their BI tasks. If you have a good knowledge and understanding of BI applications and have a “working” system in place, this book will enhance your toolbox.
What You Will Learn
- Install Anaconda, MongoDB, and everything you need to get started with your data analysis
- Prepare data for analysis by querying, cleaning, and standardizing data
- Explore your data by creating a Pandas DataFrame from MongoDB
- Gain powerful insights, both statistical and predictive, to make informed business decisions
- Visualize your data by building dashboards and generating reports
- Create a complete data processing and business intelligence system
In Detail
The amount of data produced by businesses and devices is going nowhere but up. In this scenario, the major advantage of Python is that it's a general-purpose language and gives you a lot of flexibility in data structures. Python is an excellent tool for more specialized analysis tasks, and is backed by libraries to process data streams, visualize datasets, and carry out scientific calculations. Using Python for business intelligence (BI) can help you solve tricky problems in one go.
Rather than spending day after day scouring Internet forums for “how-to” information, here you’ll find more than 60 recipes that take you through the entire process of creating actionable intelligence from your raw data, no matter what shape or form it’s in. Within the first 30 minutes of opening this book, you’ll learn how to use the latest in Python and NoSQL databases to glean insights from data just waiting to be exploited.
We’ll begin with a quick-fire introduction to Python for BI and show you what problems Python solves. From there, we move on to working with a predefined data set to extract data as per business requirements, using the Pandas library and MongoDB as our storage engine. Next, we will analyze data and perform transformations for BI with Python. Through this, you will gather insightful data that will help you make informed decisions for your business. The final part of the book will show you the most important task of BI: visualizing data by building stunning dashboards using Matplotlib, PyTables, and IPython Notebook.
Style and approach
This is a step-by-step guide to help you prepare, explore, analyze, and report data, written in a conversational tone to make it easy to grasp. Whether you’re new to BI or are looking for a better way to work, you’ll find the knowledge and skills here to get your job done efficiently.
Contents
Python Business Intelligence Cookbook
Table of Contents
Credits
About the Author
About the Reviewer
www.PacktPub.com
Support files, eBooks, discount offers, and more
Why subscribe?
Free access for Packt account holders
Preface
What this book covers
What you need for this book
Who this book is for
Sections
Getting ready
How to do it…
How it works…
There's more…
See also
Conventions
Reader feedback
Customer support
Downloading the example code
Errata
Piracy
Questions
1. Getting Set Up to Gain Business Intelligence
Introduction
Installing Anaconda
Getting ready
How to do it…
Mac OS X 10.10.4
Windows 8.1
Linux Ubuntu server 14.04.2 LTS
How it works…
Learn about the Python libraries we will be using
Installing, configuring, and running MongoDB
Getting ready
How to do it…
Mac OS X
Windows
Linux
How it works…
Installing Rodeo
Getting ready
How to do it…
How it works…
Starting Rodeo
Getting ready
How to do it…
Installing Robomongo
Getting ready
How to do it…
Mac OS X
Windows
Using Robomongo to query MongoDB
Getting ready
How to do it…
Downloading the UK Road Safety Data dataset
How to do it…
How it works…
Why we are using this dataset
2. Making Your Data All It Can Be
Importing a CSV file into MongoDB
Getting ready
How to do it…
How it works…
There's more…
Importing an Excel file into MongoDB
Getting ready
How to do it…
How it works…
Importing a JSON file into MongoDB
Getting ready
How to do it…
Importing a plain text file into MongoDB
How to do it…
How it works…
Retrieving a single record using PyMongo
Getting ready
How to do it…
How it works…
Retrieving multiple records using PyMongo
Getting ready
How to do it…
How it works…
Inserting a single record using PyMongo
Getting ready
How to do it…
How it works…
Inserting multiple records using PyMongo
Getting ready
How to do it…
How it works…
Updating a single record using PyMongo
Getting ready
How to do it…
How it works…
Updating multiple records using PyMongo
Getting ready
How to do it…
How it works…
Deleting a single record using PyMongo
Getting ready
How to do it…
How it works…
Deleting multiple records using PyMongo
Getting ready
How to do it…
How it works…
Importing a CSV file into a Pandas DataFrame
Getting ready
How to do it…
How it works…
There's more…
Renaming column headers in Pandas
Getting ready
How to do it…
How it works…
Filling in missing values in Pandas
Getting ready
How to do it…
How it works…
Removing punctuation in Pandas
Getting ready
How to do it…
How it works…
Removing whitespace in Pandas
Getting ready
How to do it…
How it works…
Removing any string from within a string in Pandas
Getting ready
How to do it…
How it works…
Merging two datasets in Pandas
Getting ready
How to do it…
How it works…
Titlecasing anything
Getting ready
How to do it…
How it works…
Uppercasing a column in Pandas
Getting ready
How to do it…
How it works…
Updating values in place in Pandas
Getting ready
How to do it…
How it works…
Standardizing a Social Security number in Pandas
Getting ready
How to do it…
How it works…
Standardizing dates in Pandas
Getting ready
How to do it…
How it works…
Converting categories to numbers in Pandas for a speed boost
Getting ready
How to do it…
How it works…
3. Learning What Your Data Truly Holds
Creating a Pandas DataFrame from a MongoDB query
Getting ready
How to do it…
How it works…
Creating a Pandas DataFrame from a CSV file
How to do it…
How it works…
Creating a Pandas DataFrame from an Excel file
How to do it…
How it works…
Creating a Pandas DataFrame from a JSON file
How to do it…
How it works…
Creating a data quality report
Getting ready
How to do it…
How it works…
Generating summary statistics for the entire dataset
How to do it…
How it works…
Generating summary statistics for object type columns
How to do it…
How it works…
Getting the mode of the entire dataset
How to do it…
How it works…
Generating summary statistics for a single column
How to do it…
How it works…
Getting a count of unique values for a single column
How to do it…
How it works…
Additional Arguments
Getting the minimum and maximum values of a single column
How to do it…
How it works…
Generating quantiles for a single column
How to do it…
How it works…
Getting the mean, median, mode, and range for a single column
How to do it…
How it works…
Generating a frequency table for a single column by date
Getting ready
How to do it…
How it works…
Generating a frequency table of two variables
Getting ready
How to do it…
How it works…
Creating a histogram for a column
Getting ready
How to do it…
How it works…
Plotting the data as a probability distribution
How to do it…
How it works…
Plotting a cumulative distribution function
How to do it…
How it works…
Showing the histogram as a stepped line
How to do it…
How it works…
Plotting two sets of values in a probability distribution
How to do it…
How it works…
Creating a customized box plot with whiskers
How to do it…
How it works…
Creating a basic bar chart for a single column over time
How to do it…
How it works…
4. Performing Data Analysis for Non Data Analysts
Performing a distribution analysis
How to do it…
How it works…
Performing categorical variable analysis
How to do it…
How it works…
Performing a linear regression
How to do it…
How it works…
Performing a time-series analysis
How to do it…
How it works…
Performing outlier detection
How to do it…
How it works…
Creating a predictive model using logistic regression
How to do it…
How it works…
Creating a predictive model using a random forest
How to do it…
How it works…
Creating a predictive model using Support Vector Machines
How to do it…
How it works…
Saving a predictive model for production use
Getting ready
How to do it…
How it works…
5. Building a Business Intelligence Dashboard Quickly
Creating reports in Excel directly from a Pandas DataFrame
How to do it…
How it works…
Creating customizable Excel reports using XlsxWriter
How to do it…
How it works…
Building a shareable dashboard using IPython Notebook and matplotlib
Getting Set Up…
How to do it…
How it works…
Exporting an IPython Notebook Dashboard to HTML
Getting Ready…
How to do it…
How it works…
See Also…
Exporting an IPython Notebook Dashboard to PDF
Getting Ready…
How to do it…
Method one…
Method two…
Exporting an IPython Notebook Dashboard to an HTML slideshow
How to do it…
How it works…
Building your first Flask application in 10 minutes or less
Getting Set Up…
How to do it…
How it works…
See Also…
Creating and saving your plots for your Flask BI dashboard
How to do it…
How it works…
Building a business intelligence dashboard in Flask
How to do it…
How it works…
Index
