Makeover Monday Week 6 – Ann’s Take, Big Data Style

Week 6 is here – and it’s big data style!  We were kindly granted VIP access to an EXASOL playground to dive in (thinking about diving into one of Christopher’s bubble charts) and play.  Loads of Chicago taxi data was available for us to consume.

So let’s get started!  Design here is on point – it just flows.  Christopher has done a fantastic job of framing the scene with two bar charts and maps on either side encasing more visuals in the center.  If I were eye tracking – my eyes are initially drawn to the taxi and then quickly dart up to the nagging question “Are Chicagoans bad tippers?”  From there the bubble chart works in harmony with the shapes used for location points.  Choice of color works well too.  I love the white background, muted maps, and yellow emphasis to mirror the taxi color.

There’s a lot of data housed here, but no numbers immediately visible – truly data visualized.  Christopher has accomplished in this design a completely interactive analytical tool, requiring the end user to ingest and synthesize the data.  I like that I know it’s millions of records, but the final presentation doesn’t feel massive or incomprehensible.

Opening up the dashboard and checking out what’s under the hood – I’m equally impressed.  Everything about this workbook has been organized for easy maintenance, understanding, improvements, and/or reconstruction.

Just look at the names of the sheets:

And the fields:

So clean and precise.  Just like the visualizations.  My brain thanks you.  There’s a famous saying that the simpler something is, the harder it was to design.  I think that’s very true of this visualization.  Knowing the data and how much of it there was, succinctly packing it into a single, symmetrically framed visualization that (pleasantly, passively) forces users to interact indicates a high level of sophistication in my book.


Makeover Monday Week 5 – Ann’s Take on Christopher’s Viz

Before I dive into Christopher’s viz this week I wanted to stop and take an important pause.  It’s February and it is week 5 of #MakeoverMonday.  This means that we’ve been showing up consistently for 5 weeks.  Elated by this knowledge, I quickly went to Wikipedia to try and self-validate that the act of participation had cemented itself in our lives as a habit.  I was thrilled with what I saw – ‘average’ participants reach asymptotic levels of automaticity in 66 days.  Well – neither of us is average, so I’m going to go toward the minimum of the recorded range (18 days) and say that we’ve made it!

It’s been an interesting adventure so far, full of unexpected drama and challenges.  (The makings of any good hero tale).

Now to the data and viz!  Christopher commented, and I quite agree: this week was simple.  Backlash has been significant throughout the community, and this data set could be perceived as an attempt to push through something unambiguous in methodology (seeing as we were only provided resultant aggregated numbers) with consistent time ranges (thanks Andy!).  That left us with a 7 x 3 data table – 21 data points (24 if we count field names).

Christopher took an approach that I had considered as well: dropping two pie charts that are clearly not designed for comparison and replacing them with a pyramid chart.  A pyramid chart shows two bar charts back to back and is a much better tool for comparing overall data shape.  Essentially what Christopher is doing here is leveraging the power of the shape to guide the end user to visually compare the figure of the data on the left and right sides.  And the figurative comparison here is of much help.  Two countries immediately jump out: Japan and Italy.  Both have significantly different proportions of employment growth share than of overall employment share, immediately begging the question: “what’s going on with those two?”

Additionally – as is customary of Christopher – we’ve got the benefit of extracted text bites derived from the original accompanying article.  They provide a lexicon for the data shape we’re presented with; I get additional value from them, and they shape what the important takeaway could be.

Although slightly ornamental in nature, I do appreciate the map integrated into the viz.  And I like even more that care has been given to add interaction to the viz.  This props up the map from ornamental to more functional and adds a good feedback/response mechanism to how I’m interacting.

A simple viz with added interactivity is a win-win in my book.  Demonstrating some of the best-practice components which have been resonating throughout the community (providing the data source, not stealing images, etc.) rounds out the purpose.  Was the goal achieved?  By Andy’s own tweet standards, do we have a better f*ing chart?  The answer is yes.

Visit the full viz on Christopher’s Tableau Public here.

Makeover Monday Week 5…Christopher’s Take on Ann’s Viz

Well I’m back…as much as I talked about competition last week Ann has certainly been a model of consistency and graciously accepted my apology for slacking off last week…of course she is always happy to take the tiara and be crowned Viz Queen!

So here we are, 5 weeks down and still at 100% apiece!  This week’s data set presented quite a challenge in that we were only given 3 data points per country, no time series, and only a handful of countries.  Well, 2 weeks in a row all I can say is: color me IMPRESSED!

In the data visualization/Tableau world, pie charts are anathema – just ask the Godfather of all things viz, Stephen Few.

So the only task…stay away from the PIE!  Like a diet…I digress.

Ann definitely surprised me with the scatter plot.  Quickly we can derive value from this somewhat ambiguous data set.  The reference line easily tells me exactly how I should interpret the relationship between the employment share and the net employment growth share.

Visually it is simple and elegant.  I’m not a big fan of grid lines but in this case they work as a contrast to the beige background.  Also I’m assuming the color codes were for continent/region…an interesting and purposeful take on a limited data set.

Overall I would recommend the UK Business Insider use THIS viz instead of the 2 ATROCIOUS pie charts!

Thank you Ann for your commitment to our project, to making me a better vizzer, and for maintaining your regal demeanor in the midst of it all!

Makeover Monday Week 4 – Ann’s Take

So apparently there’s a competition brewing between the two of us.  I said it on Twitter, but I’ll say it again: the only victory here (for me) is the learning and growth opportunity I receive from developing the makeover and the subsequent review of peer work.  It’s also the victory of having a portfolio of visualizations and something I’ve found to be very interesting: gauging the reactions of others to my vizzes.  In the world of being a data communicator/visualizer, resonating with your consumers (audience) is critical.  That’s probably what I’ve appreciated the most – seeing what gets noticed and what doesn’t.

Okay – now that the whole competition bit is out of the way, it’s time to dig in.  Christopher’s viz this week is very functional, so from an audience perspective I’m not enticed or immediately captured by the data.  I do think the data elements that are captioned out are eye-catching and I like the shape they’re creating.  The map does a good job of orienting me to the Matamata-Piako region via the annotation.  Nice that there are particulars about the geography.  The picture doesn’t add to the overall viz for me, but maybe I’m missing something regarding New Zealand.  There’s no interactivity with the data, it is very WYSIWYG – a static snapshot of data.

Moving to the mechanics and overall analysis.  I would be doing a small injustice to a lot of the chatter on Twitter this week if I didn’t mention that there’s a little bit of concern about how the RTI is being used as a measure.  The box-and-whisker plots show the RTI for domestic or international visitors for each region by year.  The RTI for each region is summed for the year, and this makes me uncomfortable.  What’s kind of interesting is that theoretically it “doesn’t matter” – if we divided everything by 12 to get averages, the data shape would be the same.  BUT (and this is the big “but”) 2016 doesn’t have all the months.  So unfortunately the districts look like they’ve been awesome except in 2016.  From a visual perspective this can be seen by comparing against the companion line chart (RTI by month) below, which shows a steady increase for international travelers continuing year over year.
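The partial-year pitfall is easy to demonstrate with a few lines of Python (made-up monthly RTI values here, not the actual New Zealand data):

```python
# Hypothetical monthly RTI values for one region: a steady index of ~100 per month.
full_year = [100] * 12     # 2015: all 12 months reported
partial_year = [100] * 7   # 2016: only 7 months reported so far

# Summing makes the partial year look like a collapse...
print(sum(full_year), sum(partial_year))  # 1200 vs 700

# ...while averaging the available months shows the truth: nothing changed.
print(sum(full_year) / len(full_year),
      sum(partial_year) / len(partial_year))  # 100.0 vs 100.0
```

Dividing by 12 everywhere would preserve the data shape for complete years, but the incomplete 2016 only looks right if you average over the months actually present.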

I feel the same way about the caption in the map stating that its RTI is 37,401.  And I am slightly bummed that it seems like the regions are colored by name – a potential missed opportunity!  If all data points were used, then coloring by summed RTI would accurately represent which regions are the hotter spots overall.

I’ve noticed an overall trend in Christopher’s style: he likes minimal chart labeling and leverages annotations.  The minimal labeling helps by freeing up canvas space between the box plot and the line chart.  There’s now space for the color legend, and it couples the two charts together more seamlessly.  I also appreciate the care taken to make an information button explaining what the RTI is.  And as always – I think Christopher has demonstrated multiple times that he likes to help the end user zero in on what’s critical.  Here the critical elements are clearly: 1) the Matamata-Piako district is a huge tourist attraction, one that by box plot standards is considered an outlier.  2) It seems this insanity started in 2011 and keeps getting more prominent.  3) It’s not even the biggest or greatest place, so again, “what’s the deal?”  4) It’s in the northern part of New Zealand, and we now have a nugget of geography that will be retained.

I would go so far as to say if asked this trivia question in the future you’d be able to answer it given multiple choice options: “Which northern district in New Zealand boasts the highest amount of tourism although it has a population of only 34k?”

Finishing up here, I’ll end by saying this:  each week of data comes with a new set of challenges and obstacles – this one seems to have had a few new landmines that tripped up most of the community.  I’ll be interested to see if Christopher finds anything within my makeover.

On my process…Christopher’s take on #makeovermonday week 3

By popular demand (read: because Ann practically BEGGED me to) I will go through my process for this week’s Makeover Monday…and my process in general for how I interact with and present social data.  A quick shoutout to my friend and Tableau social buddy Michelle Wallace, who did an amazing presentation at TC16 covering some of the concepts here in greater detail.

I have been working with social media (Twitter, Facebook, blogs, etc.) data for the past year.  Typically there are 2 ways to go about accessing this data: through a third-party vendor or through the service itself.  Third-party vendors usually aggregate things and export nice, neat CSVs or XLSXs, but can be expensive.  I personally like going directly to the source.  When I first started working with this data, it wasn’t long before my searching on Google led me to a host of other Tableau developers encountering a similar issue: how to glean value out of social media data.  I noticed very quickly that a lot of folks were utilizing Tableau’s Web Data Connector to connect directly to the Twitter and Facebook data stores.

A great starting point is hitting up the Tableau Junkie (also known as Alex Ross) for his blog on creating a Twitter Web Data Connector.

A forewarning: this connects to Twitter’s public Search API, which is limited to the latest 5,000 tweets; for anything more than this you will have to purchase Twitter’s “firehose” data stream, GNIP.

Once you bring in the data you get a set of fields similar to what we saw on this week’s Makeover Monday.  The key to social media analysis is asking the five basic questions: who, what, where, when, and why.  We want to know who said it, what they were saying, where they said it, and when they said it – and all of this hopefully combines to get at why they said it.  A key ingredient in this equation is the “what” portion…in our data that was the “Status Text”.  There are a lot of ways to glean keywords and hashtags using regex and other methods, but I wanted to see the frequency and prevalence of the exact words…ALL of the exact words.

My first thought was to use the “Text to Columns” function in Excel.  However, I quickly realized that each tweet might contain 10, 15, 20+ words; multiply that by 5,000 tweets at a minimum (if only using the Twitter public API) and we are talking 100,000+ rows of data – not the easiest thing to drag down in Excel.  And even if we could get all those words, we’d just have the words.  I wanted to see what those words connected to – who said them, when they were said – and eventually to derive my own sentiment analysis from those words.  A little more googling and VOILA: ALTERYX!

Alteryx not only has a text-to-columns function – the key was a text-to-rows function.  For example, if you have a tweet that reads:

“Today is a good day”

What should return is:

Text                           Word
Today is a good day            Today
Today is a good day            is
Today is a good day            a
Today is a good day            good
Today is a good day            day

You get the picture.  A little cleanup – employing the NLTK (Natural Language Toolkit) stop words to rid us of those nasty “the”s and “me”s and “and”s – and now we’re starting to get down to the meat!
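Outside of Alteryx, the same text-to-rows idea can be sketched in a few lines of Python (a hypothetical stand-in, not the actual Alteryx workflow; the stop-word set here is a tiny hand-picked sample rather than the full NLTK list):

```python
# Split each tweet into one row per word, preserving the original text,
# then drop common stop words.
tweets = ["Today is a good day"]
stop_words = {"the", "me", "and", "is", "a"}  # sample; NLTK's list is much longer

# One (full text, word) pair per word - the same shape Text to Rows produces.
rows = [(tweet, word) for tweet in tweets for word in tweet.split()]

kept = [(text, word) for text, word in rows if word.lower() not in stop_words]
print(kept)
# [('Today is a good day', 'Today'), ('Today is a good day', 'good'),
#  ('Today is a good day', 'day')]
```

Keeping the full tweet text on every row is what lets each word stay connected to who said it and when – the raw material for rolling your own sentiment analysis later.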

One of my absolute favorite features in Tableau is the word cloud…a very simple visual utilizing the treemap and changing the mark type to “Text”.  For more info, see a step-by-step how-to here.

A final piece that I could not get to work in a timely fashion was the search box parameter (thank you Matt Francis for the great trick) but there’s some magic needed to be able to create an action on the word cloud, the trend line, AND make the text searchable.

Needless to say I think this is a very helpful way for anyone to be able to explore the data at a high level and then drill down into the content of the posts.


Makeover Monday Week 3 – Ann’s Take on Christopher’s Viz

We’re on week 3 of Makeover Monday and I think we can almost officially call this a habit.  I can’t reiterate enough how awesome it is to have a partner in this mission.  It keeps me engaged and accountable and I am very grateful to Christopher for that.

The spirit of participation was definitely necessary for me this week, since the topic of the viz was Trump’s tweets.  I won’t bring my political views to the post, but this ranks lower than visualizing sports data in my book.  (Sorry!!)

Anyway – on to the viz!  We’ve got a fair amount of whimsy this week.  Right off the bat a line chart is being used to display Donald’s tweets and simultaneously resembles noise.  The placement and choice of photo makes it even more clear to me – he’s throwing more and more loud, erratic tweets out into the universe (hey, that’s what the line chart shows).

Moving through the viz, there’s a word cloud that surfaces tweets.  I like this concept.  Clicking on a word, I’m greeted with a full list of accompanying tweets.  Simplistic by design, but it’s easy to let time slip by looking through the tweets.

I clicked on ‘ever’ – and got a whole list of tweets that start with “Did you ever” and “Do you ever.”  I now seriously believe this is a word Trump uses quite frequently for emphasis.

I like the position Christopher took on this.  Display a little bit of how verbose Donald Trump can be, but allow maximum flexibility for someone to explore the data through guided keywords.

Now on to the workbook – which I am delighted to dig into this week, because I now know that I’m being mocked regarding the arrow comments: this week Christopher has gone the path of using the arrow shape as a shape and not a picture.

Does this mean he’s becoming more structured?  Is there anything floating in this dashboard?  Alas, there are several floating objects, which makes me realize that I need to explore float for real.

Digging deeper, what I’m really intrigued by is how the words were extracted from the entire tweet text.  Looks like this was done in Alteryx and added in.  So Christopher, we’re dying to know a little bit of detail on the Alteryx data processing that occurred!

As someone who is actively interested in learning more about Alteryx and all of its power – please pass on to us some of this wisdom!

I’m keeping this post short, but I want to pause and comment on the evolution of style.  I can already see how Christopher’s style and creativity are shifting and changing as the weeks go on.  I’ve experienced this myself (so maybe, Christopher, you’re going through the same thing) – participating in these week after week is becoming easier, and I am feeling more comfortable creating what I want out of it.  The paralysis of living up to the community is dying down and I am able to get down to vizzing much faster.

I can’t wait for next week (and to move on from this week)!


On collaboration…or a loooooong way to Christopher’s take on Ann’s Makeover Monday Week 2 Viz

Perhaps I am the child of my generation…riddled with ADD, chasing endless rabbit trails, searching for the source of the mysterious light that appears and then disappears…squirrel!  Ann’s comment about “brain sweat” had me googling to no end (ahh the wonders of the internet)…to which I found these 3 images:

this is what I imagine Ann’s brain  looks like…

this was just too funny to pass up…

and this was from a short and insightful presentation on the importance of creativity and divergent thinking here.

With all that said I echo Ann’s sentiment that I am consistently amazed week after week how different our visualizations are and how the techniques and creativity gleaned from each other’s interpretation are invaluable!

So here we are, week 2…Ann has done an incredible job detailing her process in creating her dashboard here, so I will not spill any more virtual ink trying to get at the what and the why behind the visualization.  Suffice it to say…

At first glance the dashboard appears a little busy and plain.  I’m not a big fan of serif fonts, so seeing them in the titles was a bit lackluster.  Having seen some of the other Makeover Monday submissions prior to and after viewing Ann’s viz, I was expecting some cool backgrounds and colors and “Apple-esque” design…then it hit me: much like Apple, the beauty is in the simplicity – and even more so in what’s just under the surface!

Ann has already admitted that she cheated by adding secondary data sources…so I will not beat a dead horse, but just so you know, Ann: I WIN!!!  All jokes aside, the intuition to glean the data from Statista on global and US smartphone trends was pure genius.  Leave it to Ann once again to drive the story with the data and creative captions.  Unlike my visualization, nothing is left to the end user to decipher or dive into.  It’s a clear and concise visualization concentrating on one metric to tell the story – and what is MORE, it gives Apple the benefit of the doubt: unlike the original viz (probably by some Mac-haters), it suggests that though iPhone (and Apple) growth appears to be slowing from astronomical levels, maybe the market is just saturated.  I am thankful for this beautifully simple design and the ingenuity to weave a complex story and create a different narrative than a surface reading of the original data.

So here’s to brain-sweats all around!

Makeover Monday Week 2 – Ann’s Take on Christopher’s Viz

This week’s Makeover Monday data set was the quarterly number of iPhone units sold by Apple, spanning FY Q3 2007 to Q4 2016.  The original viz and article bundled with the data set hint that Apple is struggling in more recent times.

As always, my goal in these retrospectives is the same: first approach as an interactor, someone who hasn’t been exposed to the data.  And along with that first blush interaction, document my thoughts/feelings and process.

The viz starts off very direct and journalistic.  Did Steve Jobs’ death make a difference?  The difference to “what” is described through additional text throughout the visualization.  Reading through leads me to the conclusion that it is iPhone sales, and toward the bottom that’s confirmed through the more specific question.

The death date of Steve Jobs is clearly marked, and the pattern of the data changes nicely in conjunction.  Quarters begin to dip lower and lower below the x-axis – which I interpret as zero.  If pressed to answer the question, I’d say “Yes,” there is clearly some sort of impact of Jobs’ death on the fluctuation of quarterly iPhone sales.  But… that’s not where things end.

My analyst side gets comfortable as I spend more time interacting with the visualization, and I realize that Q1 isn’t displayed for any of the years.  Spending time in the tooltips, I notice that the length of the lollipops must be based on the direct change in units sold.  This makes more recent years look much more dramatic than the accompanying percentage changes suggest.  A good example is Q3 2010 & Q4 2010 compared to Q1 2016 & Q2 2016.  The change between the first pair is a net positive 68%, an increase of 5.7 million units.  The change between the second pair is -31.5% (half the 2010 variance) and 23.59 million units.  So my brain is interpreting half the variance, but my eyes are seeing a bar 4 times as long.  Depending on which measure I’m using as my indicator, there are two competing stories here.  We’ll leave it to Apple executives for the ultimate decision.
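The mismatch is just arithmetic.  Backing the quarterly unit figures out of the percentages quoted above (illustrative values derived from the numbers in this post, not taken from Apple’s filings):

```python
# Approximate quarterly iPhone unit sales (millions), backed out of the
# percentage and unit changes quoted above - illustrative, not official figures.
q3_2010, q4_2010 = 8.4, 14.1
q1_2016, q2_2016 = 74.78, 51.19

abs_2010 = q4_2010 - q3_2010          # ~ +5.7M units
pct_2010 = abs_2010 / q3_2010 * 100   # ~ +68%

abs_2016 = q2_2016 - q1_2016          # ~ -23.59M units
pct_2016 = abs_2016 / q1_2016 * 100   # ~ -31.5%

# Half the percentage swing, but roughly 4x the bar length
# if length encodes absolute units:
print(abs(abs_2016) / abs(abs_2010))  # ~4.1
```

Encoding bar length by absolute units versus percentage change genuinely tells two different stories from the same data.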

Now diving into the technical side of things aka the workbook itself.  The first thing I notice: we’ve got the arrow again!  Is this Christopher’s signature style?  Adding a floating image version of shapes?  It is definitely a handy trick I intend on stealing.  Next I notice the care taken to make a custom color palette.  I like how it is put into action in conjunction with the gray background.  The Apple aesthetic is being captured.  The last thing I notice – the line of Jobs’ death date is a sheet!  This is blowing my mind.  Once again my structured mind would be churning on how to make this part of the overall underlying viz and here Christopher has shifted into a different dimension by simply floating the line on top.  It’s these moments in our collaboration that I can feel my brain sweating.

I urge you to check out Christopher’s full visualization on his Tableau Public profile.


#makeovermonday Week 1 – Christopher’s take on Ann’s Viz

Let me begin by saying that this blog/challenge/dynamic duo has been driven by the enthusiasm, drive, and passion of my cohort Ann!  The idea sparked in my mind at TC16 when she showed off her Tableau Public and my first reaction was to nit-pick and criticize…when in reality I realized my Tableau Public page was empty and I had NEVER done a Makeover Monday!  To my surprise, and delight, Ann came right back and said “where is yours?” to which I sheepishly replied “I don’t have any” :'(…and then her words were like a bitter medicine, necessary but painful…”then do one!”

So here we are one down 51 to go…100% Club here we come!

After Ann’s gracious feedback yesterday I am once again overwhelmed with the task of rising to her ever increasing over achievement!  So here is my humble attempt:

Right off the bat I have to say I’m a sucker for good color schemes, and Ann nails this.  Tableau out of the box gives a wide variety of color palettes, so it can often be difficult to manage the line between beauty and overkill.  In this same vein, the consistency of colors within the viz – with the lines and font – is a very nice touch.  After reading an earlier comment about the “floating legends,” I see that she put the text box and legend in a horizontal container below the viz; this is an excellent technique when designing for the web/server/mobile/etc., as you never know how the visualization will be rendered in its final state when you employ floating tiles.  The last technical piece I will applaud is the use of dynamic filters:

Not only is this an extremely creative way to employ the calculated fields used and describe the visual BUT it paints the picture so beautifully.

And with that we move on to the greatest aspect of this Makeover Monday: the story!  The ability to tell a story and communicate something so vividly is an incredible talent.  Though Ann herself does not have a daughter, the way she draws you in with the title – and the once again ingenious way she uses the field names in the tooltips – shows she is genuinely concerned about this unknown Australian daughter she might have had in a former life!  She brings the data to life, opens your front door, and sits down on your sofa to have a cup of coffee with you.

If this is the caliber of vizzes we can expect from Ann Jackson this year, you’d better buckle up your seatbelt because we are all in for a RIDE!  Looking forward to seeing what else she comes up with.

To see this and all the rest of Mrs. Jackson’s creativity, peek your head in at her Tableau Public profile and follow 🙂

Stay tuned for more next week…until then

Happy vizzing,


Makeover Monday Week 1 – Ann’s Take on Christopher’s Viz

Christopher and I have both challenged ourselves to be part of the elusive 100% club that makes up the Tableau community’s social project: Makeover Monday.  I’ve personally read through the attestations from several of my peers within the community on how their involvement has positively influenced their work and shaped them on their data artist journey.

To add further depth to the social project, Christopher and I thought it would be a good idea to take a look at each other’s Makeover Monday work and give our general impressions and opinions.  As I stated it during our recent phone call, the idea was to first take on the role of an end user or interactor.  Walk through findings and perceptions.  Then dive in from the perspective of a Tableau developer (or better said: fellow data artist) and see some of the technical components and how the visualization was made.

Now that the background is out of the way, it’s time for me to dive in to the viz.  The first thing I immediately appreciate is that this seems like the viz is set up for data discovery.  There’s a little bit of guidance in terms of what the subject matter is (and I happen to know since I vizzed it too), but other than that – it’s really up to the end user to explore the data points and understand what’s going on.  I fully appreciate that all data points are plotted because there’s a lot of good data to ingest.

My eyes were immediately drawn to the smaller (more zoomed-in) Above $300k scatterplot.  I liked how quickly I could see the spread of gender within these jobs.  I was struck with the thought: “Really?  Only 2 female occupations above $300k?”  And I could immediately answer that by determining what they are.  After some additional investigation, my mind wandered to the idea that most of the jobs in this section are very specialized and have few employed individuals.  I found myself justifying (if I can honestly say that as a feminist) that maybe the average for the 26 female neurosurgeons was skewed a bit downward, because the male count for the same occupation was 142 and we ARE talking averages.

Moving on, the next striking part of the viz was the massive number of female office workers, dwarfing any other occupation/gender combo represented.  That data point caused me to trace the jobs from the largest number of individuals back toward (0,0).  I found myself nodding at the titles and thinking, “I’m not really surprised by that # of individuals” – reinforcing some of the general gender norms I’ve grown up on.

Last, I took a look at the Below $70k section.  It is filled with tons of female occupations, and some investigation comparing occupations between male and female seemed to always lead to female workers being paid less.

Now for the technical take:  I dove right into this workbook.  The first thing I wanted to know was about the arrows.  I’m a very structured developer and so I wanted to see what the arrows were made from.  They turned out to be floating images.  Doing some additional digging, I realized that they were the arrows that come bundled with Tableau (shapes folder).  I chuckled to myself thinking that if I were going to do the same thing I probably would have made a sheet with the shape set to a calculated field or perhaps MIN(Number of Records).  Great reminder to myself: keep things simple.

Next: I had suspected that the large scatterplot had clustering applied to it.  So I went into the main sheet and validated.  This led me to see that the created clusters were actually used as filters for the two mini-scatterplots (also floated!) in the viz.  The last sheet to look at: the gender legend sheet.  Let me say this: I appreciated having this immensely.  (Mini-tangent) I tend to get confused with the gender shapes and having them reinforced with the legend and in some of the smaller vizzes with pink and blue was helpful to my mind.  Using double encoding on a data point (pink = female, plus shape = female) adds integrity and trust to my visual inspection process.

Summing it all up.  Things I really liked and appreciated: double encoding, built for data discovery, no enforced or projected point of view, subtle chunking of the data.  What’s still lingering on my mind?  If we dropped demographics on the clusters and added narrative language to describe them, what would they be called?  Would Cluster 1 (dark blue, high income, low # of individuals) be best described as: “Australian men have the market cornered on high-paying, highly specialized medical jobs.  And oh, by the way, good luck marrying one, because there are only about 50 on the continent”?

All-in-all a great start to Makeover Monday 2017.  And what’s more interesting, the giant juxtaposition between my approach and Christopher’s.

I encourage you to check out the full viz on Christopher’s Tableau Public.