ArchivedModerator of r/dataisbeautiful

[Topic][MIBM] Make It Better Monday — Anybody can ask for critique on how to make their work-in-progress better, or ask for the best way to visualize something!

Anybody can ask for critique on how to make their work more visually stunning, or ask for some basic "How do I visualize this?" help. If you have general tips you'd like to share as well, feel free to make a top-level comment!

Beginners are encouraged to ask for basic help, so please be patient responding to people who might not know as much as yourself.

To view all Make It Better Monday threads, click here. To view all topical threads, click here.

Want to suggest a biweekly topic? Click here.


What softwares do you all use and how hard is it?

1 point · 6 months ago

I'm comparing speech-to-text providers to assess which one understands European Portuguese best. I have the results from about 3k people reading a script and classifying the resulting text.

Now I wanna plot it visually, as in, how many times all providers correctly translated the voice to the corresponding text, how many times none did and all in between (provider a and b but not c, a and c but not b,...)

Funny thing, a month into this and I still don't have a clue about how to visualise the data other than a table. Any help? Thanks.
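One possible starting point, as a rough sketch: count how often each exact combination of providers got a reading right, then draw one bar per combination (this is the idea behind an UpSet plot; with only three providers a Venn diagram could also work). The provider names `A`, `B`, `C` and the sample rows below are made up for illustration, and the input layout (one dict of provider -> correct? per reading) is an assumption about how the results are stored.

```python
from collections import Counter

PROVIDERS = ("A", "B", "C")  # hypothetical provider names


def count_combinations(results):
    """Count how often each exact set of providers was correct.

    `results` is an iterable of dicts mapping provider name -> bool,
    one dict per reading.
    """
    counts = Counter()
    for row in results:
        correct = frozenset(p for p in PROVIDERS if row.get(p, False))
        counts[correct] += 1
    return counts


def label(combo):
    """Readable name for a combination, e.g. 'A + B' or 'none'."""
    return " + ".join(sorted(combo)) if combo else "none"


# Made-up sample rows, for illustration only.
sample = [
    {"A": True, "B": True, "C": True},
    {"A": True, "B": True, "C": False},
    {"A": True, "B": False, "C": True},
    {"A": False, "B": False, "C": False},
    {"A": True, "B": True, "C": True},
]

combo_counts = count_combinations(sample)
for combo, n in combo_counts.most_common():
    print(f"{label(combo):<10} {n}")
```

Each line of that output is one bar: "all three", "A and B but not C", "none", and so on. Sorting the bars by count usually makes the picture easier to read than sorting by provider.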

OC: 5
1 point · 6 months ago · edited 6 months ago

Hi, for practice I'm trying to visualize movie ratings vs production budgets, worldwide gross and domestic gross.
This is an example of what I have so far.
How do I improve on this?
How do I decide which should be the x or y axis?
Should I plot all 3 datasets on the same scatterplot?
Should I annotate each point?

Edit: Here's my d3 chart

Hey all. Sort of a low-level random question, but, I figured this would be the best place to ask. I'm trying to map out a mildly complicated schematic depicting a phone forwarding scheme using regular phones and Google Voice numbers for my workplace. What would be the best tool to make a visualization for this forwarding scheme? Thanks!

Hey there,

New to this sub, but am a general Reddit lurker. I just started my career as a Research and Data Manager at a luxury real estate company. Some of these graphs and whatnot I’ve seen on the sub are absolutely stunning. In an effort to impress my superiors, can anyone provide me some tips or links to programs to help visualize all this data? I don’t know what applications/programs the people on this sub use, but they are incredibly stunning and I would like to take a stab at creating something like that for my company. Thank you for your time.

OC: 46
3 points · 6 months ago

Personally I do R/ggplot2, but there are others.

Common /r/dataisbeautiful tools used:

  • Excel/Libreoffice/Google Sheets/Numbers - Typical spreadsheet softwares with basic plotting functions. Easy to learn but often gets called out for being corny or low-effort. It's also very "canned" and doesn't have a lot of basic functionalities that offer quality statistical representations (e.g. boxplots, heatmaps, faceting, histograms, etc.).

  • Tableau - Simple learning curve that offers more than a few basic plotting functions, and also allows interactive plots. Software is proprietary and "canned" and will cost you some. Maybe some more folks can elaborate what it's like to use, but this is my impression after hearing basic information from other users and witnessing lots of Tableau OC.

  • R (and by extension ggplot2) - R is my personal favorite, but one of the more advanced FOSS packages. The R (with ggplot2) code has a huge capability as a statistical engine and is used in a lot of parts of industry. This comes with a sharp learning curve, however. It can generate beautiful visuals, but it takes time to learn.

  • Python/matplotlib - FOSS. This is when you get into the raw code aspect of dataviz. Python is popular among software and FOSS fans, including but not limited to xkcd; and matplotlib is one of the packages that allows for plotting.

  • Gnuplot - Worth mentioning since some OC here is gnuplot based. Medium learning curve. However this software is not really well-supported, and the visuals don't come out too hot.

  • d3.js - FOSS, I think. Good for delivering high quality interactive plots. However the learning curve is steep. As is the case with R, it's capable of generating very high quality interactives.

As always, see if you can browse some of your favorite OC to see if there is a common thread among visuals that you like. All OC threads must state the tool they used (and OC-Bot will likely have a sticky to it), so if there's a lot of viz you like that's made with (say) Tableau or R, then that software is probably the right one for you.

You just made my day and were incredibly thorough and helpful. Thank you so very much!

I'm looking into the possibility of making a mod for the game Civilization 6, one that would create a turn-by-turn network graph (or social network graph?) that would visualize the groupings and quality of the relationships between civilizations.

In the game, these are usually dictated by "diplomacy points". For example, if I were to agree to open up borders with Japan, Japan and I would improve diplomatic relations between each other by, say, 3 points. If I were to get caught spying on them, the total would go down by 9 points.

Needless to say, there are relationships between many civs happening at once, with many factors determining the point totals between them at any given turn. I thought a network graph would be the best way to show which civs are in similar families (i.e. an Axis Powers, or NATO, etc.)

My question, then, is what is the best tool to use to make these sorts of network graphs? I'm hoping to find a way to have the game spit out data, populate an excel sheet, and then have the program create the graph and send it back to the game. Thanks!

OC: 46
1 point · 6 months ago

I hear that gephi is a nice platform nowadays.
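If you go the Gephi route, one way to hand it the data is a plain edge list: one row per pair of civs with the current point total as the weight. Below is a minimal sketch that builds that CSV text for a single turn. The civ names and point values are made up, the `(civ_a, civ_b) -> points` layout is an assumption about what the game could export, and the `Source`/`Target`/`Weight` column names are what Gephi's spreadsheet import expects as far as I know, so check them against the Gephi docs. Also worth checking: how the layout you pick treats negative weights.

```python
import csv
import io


def edge_list_csv(relations):
    """Return CSV text with one row per pair of civilizations.

    `relations` maps (civ_a, civ_b) -> diplomacy points. Pairs are treated
    as undirected, so each pair should appear only once.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Source", "Target", "Weight"])
    for (civ_a, civ_b), points in sorted(relations.items()):
        writer.writerow([civ_a, civ_b, points])
    return buffer.getvalue()


# Made-up point totals for a single turn.
turn_relations = {
    ("Me", "Japan"): 3,    # e.g. after agreeing to open borders
    ("Me", "Rome"): -9,    # e.g. after getting caught spying
    ("Japan", "Rome"): 1,
}

print(edge_list_csv(turn_relations))
```

Writing one such file per turn would give you the turn-by-turn series; getting the picture back into the game is a separate problem.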

How would one visualise the movement of cash between the various cryptocurrencies over the last 6 months?

I would particularly like to see how money moves out of one coin and into another.

I'm imagining a moving bar chart, with coins across the x axis and market cap along the y axis.

It would have to be animated, with pauses for various big announcements or IPOs, and a ticker for total market cap of all coins included.

Any ideas, or anyone willing to take a stab?

This is interesting, and would be really cool to see. I'm curious about the data source though: how are you going to determine that one coin was converted to another? That's typical of converting Bitcoin to some other alt-coin, but I don't know how you'd track something like NEO -> Ethereum since it would have to go through a Bitcoin exchange first.

OC: 1
2 points · 6 months ago · edited 6 months ago

Planning to track my running progress against my training plan. I also have resolved to run 1000 miles in 2018. I'd like any constructive criticism on this visualisation.

Here it is.

OC: 46
1 point · 6 months ago

Hey. Looks good, however I have a couple of critiques to unpack:

  1. The date format. I assume you're non-American? Regardless, Unitedstatesians and Non-Unitedstatesians alike should all use ISO 8601 format (relevant xkcd). It's standard in a lot of programming for obvious reasons, and is starting to take root in a lot of other industries as well. Probably a better thing to do: You can start the X axis as "Day number" and start ordering them sequentially (1, 2, 3, ...) instead of requiring everybody turn their heads to look at the date.

  2. Dual axes. They are disabled in other softwares for a very good reason, but Excel hasn't gotten with the times it seems. Your plot can be split into two plots and that would quadruple the beauty. Stephen Few has a good argument about it here. Dual scales really only make sense if you're converting one unit to another (e.g. Fahrenheit to Celsius, Miles to Kilometers, etc.).

    • Honestly, I would either split the graph into two plots, or simply rid the bars altogether. Line graphs are intuitive enough that you don't have to include additional spatial comparisons.

    • Another possibility (and this might only be available in different software) is to compare the difference between your training plan and your actual miles run, by displaying a ribbon colored red/blue based on behind/ahead of your plan. But that's just a thought.

  3. Direct labels for the lines. Instead of needing to mentally note "red is this, blue is that", you can actually see the lines and labels with direct labelling without having to look back and forth at the key.

  4. Font size on Y axis - Shrink that down. Y-label "Cumulative Distance (miles)" should be the way to go since most standards have you say "Quantity (unit)"... maybe a secondary axis for "Cumulative Distance (kilometers)" for ease of conversion. Add a title as well.

You can always do more with less. See if you can go through your plot and start removing different elements to see if they're really necessary. Some people do this too much, but more often than not people do this too little and end up with noisy charts. Less is more; remove to improve.
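A few of the points above (ISO 8601 dates, day numbers, the ahead/behind ribbon, miles to kilometers) come down to small data-prep steps that can be sketched outside of Excel. This is only an illustration: it assumes the dates are stored day-first as `DD/MM/YYYY`, that the plan starts on 2018-01-01, and the daily mileage is made up.

```python
from datetime import date, datetime

MILES_TO_KM = 1.609344  # exact definition of the international mile


def to_iso(day_first_text):
    """Convert a 'DD/MM/YYYY' string to ISO 8601 ('YYYY-MM-DD')."""
    return datetime.strptime(day_first_text, "%d/%m/%Y").date().isoformat()


def day_number(iso_text, start=date(2018, 1, 1)):
    """Day 1 is the start date, day 2 the day after, and so on."""
    return (date.fromisoformat(iso_text) - start).days + 1


def ahead_or_behind(planned, actual):
    """Cumulative actual minus cumulative planned miles, per day.

    Positive means ahead of plan, negative means behind.
    """
    gaps = []
    plan_total = run_total = 0.0
    for p, a in zip(planned, actual):
        plan_total += p
        run_total += a
        gaps.append(round(run_total - plan_total, 2))
    return gaps


# Made-up daily mileage, for illustration only.
planned = [3.0, 0.0, 4.0, 3.0]
actual = [3.5, 0.0, 3.0, 2.0]
gaps = ahead_or_behind(planned, actual)

print(to_iso("05/01/2018"), day_number(to_iso("05/01/2018")))
print(gaps)
```

The sign of each value in `gaps` is what would drive the red/blue ribbon color, and `MILES_TO_KM` is the kind of fixed conversion where a secondary axis is defensible.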

OC: 1
2 points · 6 months ago

I would love to get some feedback on how i can improve the data or which cool visualizations i could do with my data!

OC: 46
2 points · 6 months ago

Okay. A lot to unpack here.

  1. The first glaring issue is units. What are your units? Watts? Joules? BTU/hr? horsepower? kW-h? Kiloergs? Milliamps at 120V RMS? Chickens? Please display them loud and proud.

  2. Copypasting from above: The date format. I assume you're non-American? Regardless, Unitedstatesians and Non-Unitedstatesians alike should all use ISO 8601 format (relevant xkcd). It's standard in a lot of programming for obvious reasons, and is starting to take root in a lot of other industries as well. Probably a better thing to do: You can start the X axis as "Day number" and start ordering them sequentially (1, 2, 3, ...) instead of requiring everybody turn their heads to look at the date.

  3. Too much information is in this image, and I think there's a way to combine one or two graphs the next time you do it.

    1. Power consumption over 3 months might be a good place to start. Log the power consumption as a function of what you're doing each day. (e.g. how many joules did you spend gaming? how many joules browsing?)

    2. With a stacked bar graph, plot the power consumption; but this time use your gaming, surfing, idling etc. values.

  4. !pies, below:
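For the stacked bar idea in point 3, the data prep could look something like the sketch below. It assumes a log of (date, activity, average watts, hours), which may not match how the original data was recorded, and it reports energy in kWh (watts × hours / 1000). The log entries are made up.

```python
from collections import defaultdict


def energy_by_day_and_activity(log):
    """Sum energy in kWh per day and per activity.

    `log` holds tuples of (ISO date, activity, average watts, hours).
    Energy in kWh = watts * hours / 1000.
    """
    totals = defaultdict(lambda: defaultdict(float))
    for day, activity, watts, hours in log:
        totals[day][activity] += watts * hours / 1000
    return {day: dict(acts) for day, acts in totals.items()}


# Made-up log entries, for illustration only.
usage_log = [
    ("2018-01-01", "gaming", 300, 2.0),
    ("2018-01-01", "browsing", 100, 3.0),
    ("2018-01-02", "gaming", 300, 1.0),
    ("2018-01-02", "idle", 50, 4.0),
]

stacks = energy_by_day_and_activity(usage_log)
for day, acts in sorted(stacks.items()):
    print(day, acts)
```

Each day becomes one bar and each activity one colored segment of it, with the unit stated on the axis.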

OC: 1
1 point · 6 months ago

Thank you so much! I will work on that tomorrow.

Is there any additional information I can get out of my data? Like, I could read on which weekday the computer was on the most time etc.
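The weekday question is a simple aggregation. A minimal sketch, assuming the data can be reduced to hours-on per ISO date (the numbers below are made up):

```python
from collections import defaultdict
from datetime import date

WEEKDAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def hours_by_weekday(daily_hours):
    """Total hours the computer was on, grouped by weekday name.

    `daily_hours` maps an ISO date string to hours on that day.
    """
    totals = defaultdict(float)
    for iso_day, hours in daily_hours.items():
        name = WEEKDAYS[date.fromisoformat(iso_day).weekday()]
        totals[name] += hours
    return dict(totals)


# Made-up values, for illustration only.
daily_hours = {
    "2018-01-01": 5.0,  # a Monday
    "2018-01-06": 9.0,  # a Saturday
    "2018-01-08": 3.5,  # a Monday
    "2018-01-13": 8.0,  # a Saturday
}

weekday_totals = hours_by_weekday(daily_hours)
busiest = max(weekday_totals, key=weekday_totals.get)
print(weekday_totals, busiest)
```

Seven bars, one per weekday, would show this directly. Dividing by the number of each weekday in the period gives an average instead of a total.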

Moderator of r/dataisbeautiful, speaking officially · Original Poster · 1 point · 6 months ago

You've summoned the advice page on !pies. There are issues with Pie/Doughnut charts that are frequently overlooked, especially among Excel users and beginners. Here's what some experts have to say about the subject:

  • In Save the Pies for Dessert, Stephen Few argues that, with a single rare exception, the data is better represented with a bar chart. In addition to this, humans are terrible at perceiving circular area.

  • ExcelCharts argues that the pie chart is simply a single stacked bar in polar coordinates, and that there are many pitfalls to using this type of visualization. In addition, the author also argues that pie charts are better displayed as bar charts instead.

  • Edward Tufte, data viz thought leader, states about pie charts "A table is nearly always better than a dumb pie chart; the only worse design than a pie chart is several of them, for then the viewer is asked to compare quantities located in spatial disarray both within and between charts [...]. Given their low density and failure to order numbers along a visual dimension, pie charts should never be used." (excerpt from The Visual Display of Quantitative Information).

  • Cole Knaflic in this article rants about her hate of pie charts, and boldly states they should not be used.

  • Joey Cherdarchuk in this article shows how easily pies can be replaced by bar charts.

If you absolutely must use a pie, please consider the following:

I am a bot, and this action was performed automatically. Please contact the moderators of this subreddit if you have any questions or concerns.

I'm building a graph tracking my roommate's time at home (since he never is) and was wondering how to make a graph in Excel that is a timeline for the month, where it's one colour when he was here and another when he wasn't.

Was hoping to also have it change as I altered the input data

Have two columns. The first is amount of time in hours that your roommate is in. The second column is 24-(the first column's number). This will give you the time that he's in and when he's out.

Label the columns 'Time in' and 'Time out'. Label the rows 'mon, tue, wed' etc.

Then choose a '100% stacked area' graph to display it.
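The two-column setup is just `24 - hours in` per day. Spelled out as a small sketch (the day labels and hours are made up; in Excel the second column would be a plain subtraction formula):

```python
def in_out_table(hours_in_by_day):
    """Build (day, time in, time out) rows, where time out = 24 - time in."""
    rows = []
    for day, hours_in in hours_in_by_day:
        if not 0 <= hours_in <= 24:
            raise ValueError(f"{day}: hours must be between 0 and 24")
        rows.append((day, hours_in, 24 - hours_in))
    return rows


# Made-up values, for illustration only.
week = [("mon", 6), ("tue", 0), ("wed", 9.5)]
table = in_out_table(week)

for day, time_in, time_out in table:
    print(f"{day}: in {time_in} h, out {time_out} h")
```

Since every row adds up to 24, a 100% stacked chart and a regular stacked chart would look the same here.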

I'm currently tracking my sleep each night using a horizontal bar diagram. The design of it may be similar to what you are looking for. I built this in Excel, if you want to know how I can help.

This is perfect, how did you build it on excel?

I have 4 values for this graph

  1. Start of day = 12:00AM

  2. Woke up at = Time recorded by app

  3. Go to sleep at = Time recorded by app

  4. End of day = 11:59 PM

The values need to be changed to something the graph can read. The values for the graph are as follows.

  1. Start of day

  2. Woke up at - Start of day

  3. Go to sleep at - Woke up at

  4. End of day - Go to sleep at

When those values are used to create a stacked bar diagram, the values show a bar for the times that I actually went to sleep. What Excel is doing is measuring the hours between things (which is why the graphical values look weird). I would recommend looking up how to create a waterfall chart in Excel, as the technique used to create the floating bars is the same one used in my chart.
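To make the four graph values concrete, here is a rough sketch of the same subtraction done in code, with times converted to hours since midnight. It assumes you wake up and go to sleep on the same calendar day (so going to bed after midnight would need extra handling), and the example times are made up.

```python
from datetime import time


def hours(t):
    """Hours since midnight as a decimal number."""
    return t.hour + t.minute / 60


def sleep_segments(woke_up, went_to_sleep):
    """Return the four stacked-bar values described above.

    1. start of day
    2. woke up - start of day          (asleep)
    3. went to sleep - woke up         (awake)
    4. end of day - went to sleep      (asleep)
    """
    start_of_day = 0.0
    end_of_day = hours(time(23, 59))
    wake = hours(woke_up)
    sleep = hours(went_to_sleep)
    if not start_of_day <= wake <= sleep <= end_of_day:
        raise ValueError("expected start <= woke up <= went to sleep <= end")
    return (start_of_day, wake - start_of_day, sleep - wake, end_of_day - sleep)


# Made-up times, for illustration only.
segments = sleep_segments(time(7, 0), time(23, 0))
print(segments)
```

The segments stack up to the full day, which is why each bar has the same total length and only the boundaries move.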

If there is a place where I could upload my excel file for you to see I'm happy to do that as well. Good luck!

What were your rise, fall and base values then? For me I need time here and time not here, do you have formula examples?

From the Waterfall chart you just want to take the idea of using a stacked bar diagram with "invisible" bars.

My bars without fill/border are the time of day that I am awake, which is calculated from "Go to sleep at - Woke up at" (number 3 above). My filled bars, the time spent asleep, are the Rise/Fall values.

How would someone with little experience in this kind of field go about learning and creating graphs etc.?

Viz Practitioner
3 points · 6 months ago

Here's a copy+paste for a similar question. Original here.

Which of the following are you looking for?

a. Learning how to use a software to process and visualize data.

b. Learning the principles of data visualization (which chart should you use given the nature of your data)

c. Learning statistics to have a better idea of what the data means.

d. All of the above.

For (c), check the courses offered at Coursera, at edx, and the Khan Academy crash course.

You can say you've got a basic understanding of statistics when you know about: randomness, classic probability, Bayesian probability, samples, data distribution, average/mean, mode, median, parametric statistics (based on a normal distribution) like t-test, Z-test, Pearson's correlation, one-way ANOVA, two-way ANOVA, and statistical inference. Then it moves to non-parametric statistics (non-normal distributions).

The most important part here is having a "statistical mind". Besides a regular textbook, I recommend "How to lie with statistics".

For (b) check the books by Edward Tufte, especially "The Visual Display of Quantitative Information", and learn about good graphic design principles. We also have some info at our wiki.

For (a) I recommend looking for courses on MS Excel (mainly to process data, not displaying it), R (to process and display), d3js (if you want to make dynamic and interactive displays), python (to process and display), Tableau (it's getting quite popular), etc.

Finally, I recommend you familiarize yourself with different types of data visualizations, for that I recommend this article and this site, and visit sites for dataviz for inspiration and ideas: Dark Horse Analytics, Five Thirty Eight, Minimaxir, several profiles like Colin Morris or Zonination.
