This is a companion post for the Google-Tableau joint session at Google Cloud Next on March 9, 2:40 PM.
If you have a lot of data, just looking at it can be overwhelming.
With a visualization, our minds can pick out patterns. For example, here we see seasonal changes in temperatures.
What should you know about visualizing big data with Google Cloud?
- Summarize. Use the cloud to summarize the data. For example, run a query with Google BigQuery or Cloud SQL to filter and group data. Fetch a little more than you think you’ll need and use an analysis tool like Pandas to process the data locally.
- Explore. Get started on data visualization using Tableau, a spreadsheet program, or any of the many visualization and BI/Analytics partner tools.
  You don’t have to be a programmer to build visualizations, but if you are, I recommend using an interactive notebook tool such as Jupyter notebooks (which support many programming language kernels) or R notebooks. It’s really helpful to see graphics inline with the code that created them.
- Share. Found a useful visualization? Build a dashboard to monitor it and share it with your co-workers. Share directly from Tableau or use a dashboard tool like re:dash or Data Studio.
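The Summarize step can be sketched in a few lines of Python. This is a minimal illustration, not the code from the talk: an in-memory SQLite database stands in for BigQuery or Cloud SQL, and the `readings` table, its columns, and the temperature values are all made up for the example. The idea is that the database does the filtering and grouping, and pandas only sees the small summary.

```python
import sqlite3

import pandas as pd

# Hypothetical daily temperature readings. The in-memory SQLite table is a
# stand-in for a cloud database such as BigQuery or Cloud SQL.
rows = [
    ("2016-01-05", 10.0), ("2016-01-20", 12.0),
    ("2016-07-04", 35.0), ("2016-07-18", 37.0),
    ("2016-10-10", 24.0),
]
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (day TEXT, temp_c REAL)")
conn.executemany("INSERT INTO readings VALUES (?, ?)", rows)

# Step 1: let the database group the data, so only a summary is fetched.
query = """
    SELECT substr(day, 6, 2) AS month,
           AVG(temp_c)       AS avg_temp_c,
           COUNT(*)          AS n
    FROM readings
    GROUP BY month
    ORDER BY month
"""
monthly = pd.read_sql_query(query, conn)
conn.close()

# Step 2: refine the small result locally with pandas.
hot_months = monthly.loc[monthly["avg_temp_c"] > 30, "month"].tolist()
print(monthly)
print("Months averaging above 30 C:", hot_months)
```

Fetching the per-month counts along with the averages is an example of taking a little more than you strictly need: the extra column makes it possible to reweight or drop sparse months locally without going back to the database.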
Demos
- Google Sheets — Visualize the distribution of names.
- Cloud Datalab — Finding the median name with a Pareto chart.
- Cloud Datalab — Visualize weather data for Austin, TX.
- Jupyter Notebook — Visualize geographic data to pick a home close to transit and tacos.
- Tableau-BigQuery Best Practices Whitepaper
- Tableau — Visualizing NYC Taxi and Limousine data with Tableau.
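As a rough idea of what the Pareto-chart demo computes, here is a small pandas sketch. It assumes one reading of "median name": the name at which the cumulative share of people, ordered from most to least common name, first reaches 50%. The names and counts are invented for illustration and are not the data used in the demo.

```python
import pandas as pd

# Hypothetical name counts; the real demo used a much larger dataset.
counts = pd.Series({"Ava": 40, "Liam": 30, "Noah": 15, "Mia": 10, "Zoe": 5})

# A Pareto chart orders categories from most to least frequent and overlays
# the cumulative share of the total.
ranked = counts.sort_values(ascending=False)
cum_share = ranked.cumsum() / ranked.sum()

# Under the assumption above, the median name is the first one whose
# cumulative share reaches one half.
median_name = cum_share[cum_share >= 0.5].index[0]
print(cum_share)
print("Median name:", median_name)
```

In a notebook, `ranked` could be drawn as bars and `cum_share` as a line on a secondary axis to get the usual Pareto picture.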
References & Resources
Here are some of the tools used in the talk.
- Tableau
- Google BigQuery
- Cloud Datalab
- Google Cloud SQL
- Google Data Studio
- Google Cloud Platform Free Trial
Here are some of the Python tools and libraries used in the talk for visualization and analysis.
- Jupyter Notebooks (for inline code & graphics)
- Pandas (for filtering & analysis)
- Matplotlib (for general plotting)
- Folium (for visualization on maps)
- SQLAlchemy (for connecting to Cloud SQL)
- SciPy (for calculating Voronoi diagrams)