
What's the best tool(s) to plot ~10000 points with labels and not have the labels overlap?

I looked at everything Python has to offer and haven't found anything solid. I've been using pyplot to make the plots and it can do 10000 points with labels no problem; the issue is that many of the point labels overlap.

There is a package called adjustText that changes the positions of the labels so that they don't overlap, but it seems to handle at most 3500 points. Anything beyond that and Google Colab is not able to process the graph before the time limit for a session is up (12 hours), even on GPU mode.


Link to dataset:

I'm looking for a hierarchical visualization that captures the relational aspects of the dataset. D3.js is my preferred tool for the job but I'm open to ideas. I've played with basic dendrograms and network graphs. Biggest issue is that the dataset can overwhelm most displays; I need some way to group them together (based on hierarchy).


Does anyone know of any research that looks at what kinds of visualizations different user types typically like to see? For example, do CEOs typically use different kinds of visualizations than workers in their organizations?

It makes sense to me that different user types would prefer different visualizations, but I can't find any research to prove or disprove it.


Hello! I'm sitting on a ton of data points and am looking for a way to plot/chart/graph them that will be visually interesting and informative to our users. It is essentially just a long data set of finishing times for an online competition, containing the timestamp when it occurred, total finishing time (0 for a failed attempt), whether they finished or not (1 for yes, 0 for no) and, if the user was logged in, their username (blank means they did it as a guest).

I'd like to make this data publicly available to our users in a meaningful, infographic sort of way to give them an idea of (a) what the overall distribution of successful finishing times was, (b) how many people succeeded vs. how many failed, (c) who the top finishers were (by lowest time where there's a username), and maybe (d) break down finishing-time percentiles. I'd also like to (e) somehow compare the statistics for logged-in users (if a username is showing) vs guests (no username). And of course open to any other ideas you might feel are visually interesting or informative.
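Before picking chart types, it may help to compute the numbers behind (a) through (e). This is a minimal sketch, assuming each CSV row has been loaded into a dict with the hypothetical keys `time`, `finished` and `username` (the real column names are not given in the post).

```python
import statistics


def summarize(rows, top_n=3):
    """Summarise finishing-time rows (dicts with time, finished, username)."""
    finished = [r for r in rows if r["finished"] == 1]
    times = [r["time"] for r in finished]

    def median_time(group):
        values = [r["time"] for r in group]
        return statistics.median(values) if values else None

    named = [r for r in finished if r["username"]]       # logged-in users
    guests = [r for r in finished if not r["username"]]  # blank username
    fastest = sorted(named, key=lambda r: r["time"])[:top_n]
    return {
        "succeeded": len(finished),
        "failed": len(rows) - len(finished),
        # Three cut points splitting successful times into quarters.
        "quartiles": statistics.quantiles(times, n=4) if len(times) >= 2 else None,
        "top_finishers": [(r["username"], r["time"]) for r in fastest],
        "median_logged_in": median_time(named),
        "median_guest": median_time(guests),
    }
```

The successful times on their own suit a histogram, the succeeded/failed counts a single stacked bar, and the two medians a side-by-side comparison of logged-in users and guests.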

Reddit gold goes to the best solution. :-)

Thank you so much for your help!

Dataset is here in csv format:


What would be the best way to visualise a table?


I wanted to make a US map that visualized deaths from drug poisoning from CDC data. I want to make one that compares 1999 vs 2016. I am okay in R -- def have a LOT to learn. This is my first time trying to create a visualization like this. I am using this guide to help me. The very last section of my code is giving me the following error:

Error: geom_polygon requires the following missing aesthetics: x, y

I am 99% sure the x and y aesthetics are latitude and longitude from the "us" variable.

I know I am going to have to adjust the theme and add the title, and pick a gradient for the rate, etc. I just want to get something first to play around with. Thank you!


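library(dplyr)    # %>%, select(), filter(), mutate(), inner_join()
library(ggplot2)  # ggplot(), map_data() (map_data() also needs the maps package installed)

drug_deaths_1999 <- drug_deaths %>%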
  select(State, Year, Deaths, Population) %>%
  filter(Year == 1999,
         State != "United States") %>%
  mutate(rate = (Deaths/Population) * 100000) 

drug_deaths_1999$State <- tolower(drug_deaths_1999$State)

drug_deaths_2016 <- drug_deaths %>%
  select(State, Year, Deaths, Population) %>%
  filter(Year == 2016,
         State != "United States") %>%
  mutate(rate = (Deaths/Population) * 100000) 

drug_deaths_2016$State <- tolower(drug_deaths_2016$State)

states <- map_data("state")
states$State <- states$region

drug_deaths_1999 <- inner_join(drug_deaths_1999, states)
drug_deaths_2016 <- inner_join(drug_deaths_2016, states)

# Put x, y, and group in ggplot() so that every layer added later inherits them
us <- ggplot(data = states, aes(x = long, y = lat, group = group)) +
  geom_polygon(color = "white") +
  coord_fixed(1.3)

ditch_the_axes <- theme(
  axis.text = element_blank(),
  axis.line = element_blank(),
  axis.ticks = element_blank(),
  panel.border = element_blank(),
  panel.grid = element_blank(),
  axis.title = element_blank()
)

## this was not working: the layers below had no x and y because the
## aesthetics were set inside the first geom_polygon() instead of ggplot()

us +
  geom_polygon(data = drug_deaths_1999, aes(fill = rate), color = "white") +
  geom_polygon(color = "black", fill = NA)

I think I caught a contractor lying about her location for YEARS.

Before I bark up the chain, I want to boil it down to something easy and visual for my boss to digest.

He is not one for spreadsheets and I need IMMEDIATE action taken since this contractor is in the healthcare segment. My boss literally reacts in minutes to graphs.... hours or days will go by if given a spreadsheet.

So... how do I do this?


IP addresses are largely the same for each facility. For example: is always Office A, is always Office B, with rare exceptions. (And very easily explained exceptions, as I get the contractor's logs daily.)

I was thinking a chart? But maybe an animation might do a better job telling the data story?

Any help is appreciated!

Tools I have: Ubuntu Linux, Inkscape, GIMP, and the LibreOffice spreadsheet, which is very close to Excel. Open to downloading a program to do this, if need be.


So the dataset will be exclusively from the titles in r/Igot out: country names and their shortenings. Denmark: DK; Netherlands and/or Holland: NL; etc.

Just collect the names like the opposite of a wordcloud on a world map.

If you feel adventurous, try to add some filters and stuff. Take into account title sentences such as “I will go to either x, y or z”, where more than one country is mentioned; “US to Europe”, where Europe is a region/continent; and “from x to y”, which indicates where they moved from and to. Include symbols such as “->”, which point from the place they left to the place they moved to.
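A rough sketch of how the title parsing described above could start. The lookup tables are tiny hypothetical samples, not a real country list, and the splitting rule (first "->" or " to " separates origin from destination) is an assumption that will misread some titles.

```python
import re

# Hypothetical sample lookups; a real run needs a full country list.
NAMES = {"denmark": "DK", "netherlands": "NL", "holland": "NL", "germany": "DE"}
CODES = {"DK", "NL", "DE", "US"}   # matched only when written in capitals
REGIONS = {"europe"}


def find_places(text):
    """Return known country codes and region names in the order they appear."""
    found = []
    for word in re.findall(r"[A-Za-z]+", text):
        if word in CODES:
            found.append(word)
        elif word.lower() in NAMES:
            found.append(NAMES[word.lower()])
        elif word.lower() in REGIONS:
            found.append(word.title())
    return found


def parse_title(title):
    """Split a title into origin and destination places."""
    parts = re.split(r"\s*->\s*|\s+to\s+", title, maxsplit=1)
    if len(parts) == 2:
        return {"from": find_places(parts[0]), "to": find_places(parts[1])}
    return {"from": [], "to": find_places(title)}
```

For example, `parse_title("US -> Europe")` gives `{"from": ["US"], "to": ["Europe"]}`, and a title listing several destinations after "to" returns all of them in the `"to"` list. The counts per country code can then drive label sizes on the world map.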


I have sales revenue for four companies that I want to compare over time, while also comparing the growth from one period to the next (quarterly in this case). I currently have a combo chart with columns representing the revenue and a line representing the growth, but it feels way too busy for quick interpretation. Any other potential ways to display this?
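One option is to split the two measures into separate small charts per company: one of revenue indexed to the first quarter, one of quarter-over-quarter growth. A minimal sketch of both calculations, assuming each company's revenue is a plain list ordered by quarter (function names are made up for illustration):

```python
def quarterly_growth(revenue):
    """Percent change from the previous quarter; None where undefined."""
    growth = [None]  # the first quarter has no previous quarter
    for prev, cur in zip(revenue, revenue[1:]):
        growth.append((cur - prev) * 100 / prev if prev else None)
    return growth


def indexed(revenue):
    """Revenue rescaled so the first quarter equals 100."""
    first = revenue[0]
    return [value * 100 / first for value in revenue]
```

Indexing puts four companies of very different sizes on one comparable axis, which is often easier to read than raw revenue columns with a growth line on top.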


Hi all,

I work for a high-tech company and I'm trying to map project dependencies across teams.

I have a data set that contains a list of projects along with the teams working on them, as follows:

"Project 1": "Team A", "Team B", "Team C"
"Project 2": "Team A", "Team C", "Team D"
"Project 3": "Team B", "Team D", "Team E"

There are about 40 projects and 50 teams.

My goal is to create a visualization in which teams are nodes and projects are links. Hovering over the link should show the project name.

The end goal is to see clusters of work so that meetings can be scheduled to coordinate activities across groups of teams that work the most together.
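Most network tools want an edge list rather than a project list, so the first step is turning each project into team pairs. A minimal sketch, assuming the data is loaded as a dict from project name to team names (`team_links` is a made-up name):

```python
from collections import defaultdict
from itertools import combinations


def team_links(projects):
    """Map each pair of teams to the list of projects they share."""
    links = defaultdict(list)
    for project, teams in projects.items():
        # Every pair of teams on the same project gets a link.
        for a, b in combinations(sorted(set(teams)), 2):
            links[(a, b)].append(project)
    return dict(links)


projects = {
    "Project 1": ["Team A", "Team B", "Team C"],
    "Project 2": ["Team A", "Team C", "Team D"],
    "Project 3": ["Team B", "Team D", "Team E"],
}
links = team_links(projects)
```

The number of shared projects per pair works as an edge weight for clustering, and the project list per pair is what a hover tooltip would display.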

Any advice? Thanks in advance!


For instance, I have two shampoos, a dry scalp lotion and a night cream. I used these 70-95% correctly. I need to be able to showcase this to my M.D. and dermatologist to make convincing arguments, whether on an iPad mini (iOS 9.3.5, without Excel and most App Store apps), on printouts, or, if I fix the screen next week, on a Surface Pro 3.

I have 20+ habits and I need 7 of them. Ish. Some bad, some good. Like food in bed (bad) vs. applied shampoo #2 (good).

No pie charts, I think. Just a high-school graph with lines.

Exporting to Excel gives 1 column with weird rows for each entry. I have zero idea where to start in restructuring the data. I know very little HTML, and the 2003 and 2010 versions of Excel.

How do I convert exported .csv Habitbull data to nice graphs with multiple habits?


Sorry for my question context, I couldn't figure out how to word it.

Basically, every minute I am fetching what game I am playing/what channel I am in etc.

I want to display a graph that shows (from 9:00 - 10:00 he was playing League) (9:30 he was playing CSGO) etc.

I was using a stepped area graph to achieve this:

However, with this graph, if the other 2 are false (which they will always be), it fills in the rest of the area. I want it to remain in its range (1 tall, X wide [depending on time online]).
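A timeline (Gantt-style) chart fits this better than a stepped area, and it needs start/end intervals rather than per-minute flags. A minimal sketch of that conversion, assuming the samples are `(minute, activity)` pairs sorted by minute, with `None` when nothing is running (names are hypothetical):

```python
def to_intervals(samples):
    """Merge consecutive per-minute samples into (activity, start, end) runs."""
    intervals = []
    for minute, activity in samples:
        last = intervals[-1] if intervals else None
        if last and last[0] == activity and last[2] == minute:
            # Same activity and no gap: extend the current run by one minute.
            intervals[-1] = (activity, last[1], minute + 1)
        else:
            intervals.append((activity, minute, minute + 1))
    # Offline stretches are dropped so they leave empty space on the chart.
    return [iv for iv in intervals if iv[0] is not None]
```

Each resulting tuple becomes one horizontal bar: one row per game, 1 tall, as wide as the time played.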

If anyone knows of a better way to visualize this, that would be great.


I am playing around with some visuals for Maryland's primary election data:

In particular, I would like to create a county-level heat map for different pieces of data. But I've never done a map and don't even know which tools can do that. My current skill set is basically only Excel (with the tiniest start in PowerBI).


A small survey has been done on /r/tall to gather data about the subscribers and make some unscientific correlations.

I think that it could be quite interesting to have a way to visualize it but my spreadsheet skills to make charts are a bit clunky.

Link to dataset:

Description of what I am looking for:

I'll leave it to you to choose the best representation of the data. You can also post it on /r/tall for the karma :)


Our publication, Dig3st, is hosting a writing contest for submissions that are < 3-minute reads and include at least one element of data visualization. The winning submission will receive $300 and be announced on our Twitter page. Rules posted here:

While there is no particular dataset for this contest, for the sake of providing a possible direction and link to a dataset there is plenty at to be discovered.


Several datasets will be used here. We will also pay. Hit me up via PM if you can with work samples. Thanks!


I am working with a dataset that puts each observation into multiple categories. I am trying to find a visualization to best represent how many observations are in each category and all sub-categories. I am currently using a sunburst chart, but I am looking for something better. Does anyone have any ideas?


Since the first 200,000 is such an outlier (over 2 years, compared to the rest, which took just days or weeks), it might be much easier to view the progress in the visualization if that is left out or indicated in some other way.

Will be happy to gild the first couple who create a graphical representation of our progress.

Thanks in advance!

LC Progression Date Chart by Whit

If you are trying to figure out where we were in this count's progress on a given date, or trying to remember what month it was when we were in the 4,000,000s, this PDC will be useful to you!

If you are trying to find a section of the MRT based on a certain day, week or month this table will help you figure out what count we were at at the time.

Special thanks to Questoguy and Chalupa_Dad who made this easy for me to make!

Count  Date (YYYY-MM-DD)
1st count 2014-07-23
100,000 2015-07-03
200,000 2016-09-13
300,000 2016-10-06
400,000 2016-10-24
500,000 2016-11-05
600,000 2016-11-12
700,000 2016-11-20
800,000 2016-11-28
900,000 2016-12-05
1,000,000 2016-12-11
1,100,000 2016-12-19
1,200,000 2016-12-22
1,300,000 2016-12-30
1,400,000 2017-01-07
1,500,000 2017-01-13
1,600,000 2017-01-22
1,700,000 2017-01-29
1,800,000 2017-02-05
1,900,000 2017-02-10
2,000,000 2017-02-14
2,100,000 2017-02-20
2,200,000 2017-02-24
2,300,000 2017-03-01
2,400,000 2017-03-05
2,500,000 2017-03-09
2,600,000 2017-03-14
2,700,000 2017-03-21
2,800,000 2017-03-26
2,900,000 2017-04-03
3,000,000 2017-04-08
3,100,000 2017-04-12
3,200,000 2017-04-17
3,300,000 2017-04-25
3,400,000 2017-05-03
3,500,000 2017-05-12
3,600,000 2017-05-19
3,700,000 2017-05-26
3,800,000 2017-05-29
3,900,000 2017-06-03
4,000,000 2017-06-06
4,100,000 2017-06-09
4,200,000 2017-06-14
4,300,000 2017-06-19
4,400,000 2017-06-23
4,500,000 2017-07-08
4,600,000 2017-07-22
4,700,000 2017-07-30
4,800,000 2017-08-06
4,900,000 2017-08-14
5,000,000 2017-08-23
5,100,000 2017-09-01
5,200,000 2017-09-20
5,300,000 2017-10-04
5,400,000 2017-10-18
5,500,000 2017-10-29
5,600,000 2017-11-04
5,700,000 2017-11-06
5,800,000 2017-11-21
5,900,000 2017-12-02
6,000,000 2017-12-07
6,100,000 2017-12-14
6,200,000 2017-12-26
6,300,000 2017-12-31
6,400,000 2018-01-05
6,500,000 2018-01-13
6,600,000 2018-01-19
6,700,000 2018-01-30
6,800,000 2018-02-09
6,900,000 2018-02-13
7,000,000 2018-02-18
7,100,000 2018-03-03
7,200,000 2018-03-21
7,300,000 2018-04-04
7,400,000 2018-04-16
7,500,000 2018-04-27
7,600,000 2018-05-04
7,700,000 2018-05-09
7,800,000 2018-05-17
7,900,000 2018-05-29
8,000,000 2018-06-03
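For a chart of pace rather than raw dates, the table can be reduced to "days taken per 100k". A minimal sketch, using the first few rows of the table above and treating "1st count" as count 0 (that mapping and the function name are assumptions):

```python
from datetime import date


def days_per_milestone(milestones):
    """Days elapsed between each milestone and the one before it."""
    out = []
    for (_, prev), (count, cur) in zip(milestones, milestones[1:]):
        elapsed = date.fromisoformat(cur) - date.fromisoformat(prev)
        out.append((count, elapsed.days))
    return out


milestones = [
    (0, "2014-07-23"),
    (100_000, "2015-07-03"),
    (200_000, "2016-09-13"),
    (300_000, "2016-10-06"),
    (400_000, "2016-10-24"),
]
pace = days_per_milestone(milestones)
```

Plotting the day counts on a log axis, or drawing the first two milestones in a separate panel, keeps the slow start from flattening everything after it.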

Hey friends, not sure where to post this, so trying here.

My girlfriend's birthday is coming up, and we both enjoy data. So I thought it would be a cute gesture to throw all of our messages to each other in a database, and use some form of data visualisation tool (probably Tableau) to pull out some cool data.

I'm mainly curious if anyone has suggestions about how to structure the database. I work as a Software Engineer and have worked with Tableau before, so implementation shouldn't be too hard. But given what I'm trying to do, I imagine just putting each message in as a TEXT field is not the best way to go about it.

I'm considering using MySQL, and think I basically want to create a structure where all unique words go into a lookup table and get their own ID, and then use a join table between words and messages (possibly with a table in between for sentences?). The join table would keep track of the index of each word in a message/sentence, etc. But yeah, any input on how to structure it to make the data easiest to analyse later would be appreciated.
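The structure described above could look roughly like this. It is a sketch only: it uses Python's built-in SQLite instead of MySQL so that it runs anywhere, skips the optional sentence table, and every table and column name is made up for illustration.

```python
import sqlite3

SCHEMA = """
CREATE TABLE messages (
    id INTEGER PRIMARY KEY,
    sender TEXT NOT NULL,
    sent_at TEXT NOT NULL,
    body TEXT NOT NULL
);
CREATE TABLE words (
    id INTEGER PRIMARY KEY,
    word TEXT NOT NULL UNIQUE
);
CREATE TABLE message_words (
    message_id INTEGER NOT NULL REFERENCES messages(id),
    position INTEGER NOT NULL,
    word_id INTEGER NOT NULL REFERENCES words(id),
    PRIMARY KEY (message_id, position)
);
"""


def add_message(conn, sender, sent_at, body):
    """Store a message and one message_words row per whitespace-split token."""
    cur = conn.execute(
        "INSERT INTO messages (sender, sent_at, body) VALUES (?, ?, ?)",
        (sender, sent_at, body),
    )
    message_id = cur.lastrowid
    for position, token in enumerate(body.lower().split()):
        conn.execute("INSERT OR IGNORE INTO words (word) VALUES (?)", (token,))
        word_id = conn.execute(
            "SELECT id FROM words WHERE word = ?", (token,)
        ).fetchone()[0]
        conn.execute(
            "INSERT INTO message_words VALUES (?, ?, ?)",
            (message_id, position, word_id),
        )
    return message_id


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
add_message(conn, "me", "2018-06-01T09:00:00", "good morning love")
add_message(conn, "her", "2018-06-01T09:05:00", "morning love you")

# Most-used words overall, ties broken alphabetically.
top = conn.execute(
    "SELECT w.word, COUNT(*) AS n FROM message_words mw "
    "JOIN words w ON w.id = mw.word_id "
    "GROUP BY w.word ORDER BY n DESC, w.word LIMIT 2"
).fetchall()
```

Keeping the original `body` next to the word tables costs little and leaves room to re-tokenize later, for example to handle punctuation or emoji differently.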

And just to specify, the main goal here isn't to reach some specific final visualisation, the point is more creating the dataset, so something that for example automatically creates a word cloud is not really what I want.


Hi everyone!

I am currently thinking about creating a lobby dashboard to represent energy (kWh) data. I want to highlight how much each building is consuming and how much carbon it emits. I am concerned that people will just pass by and ignore it. How would you represent the data so people would actually stop and be interested in this? A bubble chart with one bubble per building (bubble size changes depending on amount consumed) playing in a loop? A map of the campus with the buildings which consume the most in red? A simple year-on-year comparison?

Data collected:

- kWh
- location
- time stamp

Thank you in advance for your suggestions! :)


Does anyone know what software I can use to make clean graphs like the ones seen in scientific literature?

Such as this one? Which has those bars for the upper and lower values?

I don't know why, but Excel graphs just have a non-professional look to them.


A friend of mine has set up a gofundme because she had quit her job and claims we need to fund her car so that she can look for work. I know she lives near a bus station, but I was hoping to make a topographical/color coded map of the state that signifies how far of a walk it is to the bus stop.

Also, I think it would be neat to show how far it is to other places like convenience stores or McDonald's.

Any help would be appreciated. Don't really know where to start.


Hi all

I just graduated from college (hooray) and I've been tracking a bunch of data over the past 4 years. I was wondering what would be the best tools and design to visualize this data. I'm very open to trying out some new Python / Java libraries.

Data collected:

Todoist - tracks my productivity

Toggl - tracks how I spent my time

Daylio - mood and activities

Fitbit - pedometer, weight, exercise, and heart rate

Calendar - tracks all my big life moments from exams and presentations to vacation and travel

Excel - categorized monthly income / spending


Pomodoro - more productivity tracking

It would be great to composite a lot of this information so I can see how different activities affected each other.

Thanks for the suggestions!


Hi everybody!

I recently stumbled over sankey diagrams in r/dataisbeautiful and used some of the referenced web tools to visualize ticket flows in IT service management. Our management grew fond of them quite quickly and we want to make them a standard tool. I know there are various Excel plug-ins, but our requirements are often a bit non-standard, so I would like to understand how they are built and create something that is a bit more fit for purpose.

Does anybody have a code example or some helpful background on how sankey diagrams are generated? I can code in a couple of programming languages and read some more, so anything would be helpful.
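As background, a sankey diagram starts from a list of weighted links, and the node boxes are derived from them. The sketch below shows one simple way to do that derivation, not how any particular tool does it: each node is as tall as the larger of its total inflow and outflow, and its column is the longest path leading to it. It assumes the flow has no cycles, and the function and key names are made up.

```python
from collections import defaultdict


def sankey_layout(links, height=100.0):
    """Compute column, y offset and height for each node of an acyclic flow.

    links: list of (source, target, value).
    """
    inflow, outflow = defaultdict(float), defaultdict(float)
    for source, target, value in links:
        outflow[source] += value
        inflow[target] += value
    nodes = sorted(set(inflow) | set(outflow))
    size = {n: max(inflow[n], outflow[n]) for n in nodes}

    # Column = length of the longest path from any starting node.
    column = {n: 0 for n in nodes}
    for _ in nodes:  # enough passes for the longest possible path
        for source, target, _value in links:
            column[target] = max(column[target], column[source] + 1)

    # Scale so that the fullest column exactly fills the available height.
    column_total = defaultdict(float)
    for n in nodes:
        column_total[column[n]] += size[n]
    scale = height / max(column_total.values())

    # Stack the nodes of each column from the top down.
    y_cursor = defaultdict(float)
    layout = {}
    for n in nodes:
        node_height = size[n] * scale
        layout[n] = {"column": column[n], "y": y_cursor[column[n]], "height": node_height}
        y_cursor[column[n]] += node_height
    return layout
```

Each link is then drawn as a band between its two node boxes, with a thickness of `value * scale`. Finished tools generally also reorder nodes within a column to reduce crossing bands, which this sketch leaves out.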

I know this does not perfectly fit in here, but this is the only sub I could find that at least somehow fits the question.
