I would like some help visualizing some test score data. I have color coded the scores based on a range scale that corresponds to levels determined by the state. I would like to show visually how many students scored in each color code. I have some knowledge of how to use Sheets and Excel to create simple graphs, but I feel this is a bit beyond my experience. Any help is welcome. For the safety of the students, all their information has been deleted. Only the scores are shown. Thank you once again.
Here is a link to the data: https://docs.google.com/spreadsheets/d/1h62UYLCli6DnyAL2JK7iUudiK23oGovDRdwipCHXQo4/edit?usp=sharing
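One possible starting point, sketched in Python: assign each score to its color band and count the students per band, which is the table a bar chart needs. The band names and cut scores below are made up for illustration (the post does not give the state's ranges), so they would need to be replaced with the real ones.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical cut scores and band names; replace with the state's real ranges.
CUTOFFS = [60, 70, 80, 90]   # lower bounds of the 2nd..5th band
BANDS = ["red", "orange", "yellow", "green", "blue"]

def band_for(score):
    """Return the color band whose range contains the score."""
    return BANDS[bisect_right(CUTOFFS, score)]

def count_by_band(scores):
    """Count how many scores fall into each band, keeping band order."""
    counts = Counter(band_for(s) for s in scores)
    return {band: counts.get(band, 0) for band in BANDS}

scores = [55, 61, 69, 70, 88, 90, 97]   # made-up example scores
counts = count_by_band(scores)
for band, n in counts.items():
    # Crude text bar chart: one '#' per student in the band.
    print(f"{band:>7} {'#' * n} {n}")
```

The same counts can be produced in Sheets or Excel with COUNTIFS on the score column, then charted as a bar chart with one bar per color.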
I've always enjoyed r/dataisbeautiful and I've been toying with the idea of making OC for the first time. Unfortunately I have no idea what I'm doing and I wanted to see if anyone might be willing to give me some pointers and answer a few questions.
I've looked at a few programs and read a few guides, so I have a general idea of how it's done, and I did some similar stuff in high school, but I feel completely lost right now. It might be because I'm biting off more than I can chew with my first project, or I'm just dumb. Probably both.
I want to replicate and re-visualize the data from this study:
My initial thought was that the data I needed is contained in the text itself, and in fact it might be! Looking at the various tables made me wish I had paid better attention in high school. I don't think the data I need is in the text, though.
After the sources are listed, it has a section titled Supplementary Material that says:
Data/code for replicating results http://dx.doi.org/10.1017/S1537592714001595
That brings me to the Cambridge University journal the study is hosted on. That page has a section titled Supplementary Material too, and its link brings me to a bank of downloadable files that I was certain was what I needed.
I opened all the files and I don't know if what I'm reading is what I need.
It doesn't look like how I pictured a data set, it appears to be instructions and notes like a coding library or something. The 6th and final downloadable file is literal complete gibberish and random characters, as if I opened it with the wrong program.
This is where I hit my dead end. I really have no idea what I'm looking at. I've obviously never done anything like this and I lack all the general prerequisites of the average data artist. If you could tell me if this project is entirely beyond my scope, I would really appreciate that.
Alternatively if there's a workable data set somewhere and it doesn't require some sort of advanced math or coding knowledge, or if you can translate some of these tables and tell me how I can use them to replicate some of their graphs, I would really appreciate you.
If you decide you're intrigued and would like to take on the project yourself, or help me create it, you have my full support too.
Thanks for reading, and thanks in advance for any help.
I'm on a rail journey without my laptop and I have an urgent submission due that needs some aesthetically good visualisations. I was going to make them in Tableau, but without my laptop I can't. The datasets are really small (not even a proper dataset, just an Excel file with some rows and columns) and will take very little of your time. Please take a look to decide if you can help. Can someone make some cool-looking visualisations? There is no requirement for a specific type of visualisation; you can do anything with it, just make it interesting and rad. Please help, peeps of Reddit, it is really urgent.
Link to excel sheet: https://drive.google.com/file/d/1vtUlSQA3sCXLSxw9YB47zv66_Yh3nmMu/view?usp=drivesdk
The data are split into 6 learning stages (each denotes the stage in the behavior protocol at which the mice were dissected): Homecaged (HC), Untrained (UT), 1st Training Trial (T-1), Retention Test (RT), Extinction Training (ET), and Conflict Training (CT). The slices were imaged and ranked on an axis from most dorsal to most ventral (location relative to the brain), and we used imaging software to isolate and analyze the particles. We then ordered the counts into the bottom two tables in the data set. We calculated the ratios of the cells for each substructure (Ca-1, Ca-3 and Dentate Gyrus or DG). We found the average cell counts and average ratios for each learning stage (split by substructure); these are in the top two tables. My issue is that my sample size is small, with n=5 at most and n=4 for some of the learning stages. This meant that I could not do a t-test. How would you recommend I visualize the data? If anyone knows how to do a statistical analysis and properly visualize it for this kind of data, I would be most appreciative. My presentation is due this coming Wednesday and I have yet to come up with an idea for statistical analysis, so any help or opinions on that would also be highly appreciated.
Thank you for your time and help.
Should I be showing SD or display the results in a manner that represents the data better?
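On the SD question, a small sketch of the per-stage summary numbers (n, mean, sample SD, and standard error) may help when deciding what to draw. The cell counts below are invented placeholders, not values from the dataset, and `summarize` is just an illustrative helper name. With n of 4 or 5, one common choice is to plot every individual value as a dot and overlay the mean with SD bars, so the reader sees both the spread and the sample size.

```python
from statistics import mean, stdev

# Made-up cell counts per learning stage (n = 4 or 5), for illustration only.
cell_counts = {
    "HC": [10, 12, 11, 13],
    "UT": [14, 15, 13, 16, 12],
}

def summarize(values):
    """Return n, mean, sample SD and standard error of the mean for one group."""
    n = len(values)
    sd = stdev(values)               # sample SD (divides by n - 1)
    return {"n": n, "mean": mean(values), "sd": sd, "sem": sd / n ** 0.5}

summary = {stage: summarize(v) for stage, v in cell_counts.items()}
for stage, s in summary.items():
    print(f"{stage}: n={s['n']} mean={s['mean']:.2f} sd={s['sd']:.2f} sem={s['sem']:.2f}")
```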
I would reward any good quality help or direction with an original sonnet and some reddit gold.
Link to dataset: https://vectorspace.ai/recommend/datasets/
From the site:
What kind of things can be done with custom concept columns & features?
Create unique sectors or clusters based on concepts and hidden relationships and compare their gains to the S&P (see below)
Determine if price correlations have similar concept or keyword correlations
Examine symbiotic, parasitic and sympathetic relationships between equities
Automatically create baskets of stocks based on concepts and/or keywords
Detach the custom columns and append them to other proprietary inhouse datasets
Select a Data Context (e.g. Biological, Chemical, Geophysical and others) to derive different signals
Use stock symbols as custom concept column labels and model cross-correlations between equities
Create features using trending terms anywhere on the internet
How do the concepts & trends correlate to crypto, stocks or ETFs?
Scores range from 0 to 1 and represent strength of known and hidden relationships between a concept and a stock, option or ETF. The score is calculated based on a series of algorithms that monitor data surrounding each company associated to the underlying security where each score is combined with scores from human curation teams. These concepts can then be factored or parameterized for exploring new signals or building new models. [Ref: http://www.cboe.com/rmc/2015/JPM_Correlations_RMC_20151%20Kolanovic.pdf ]
Click on "View Raw" to download the dataset.
"type 1", "Type 2" and "Type 3" are 3 categories. The names are clustered for each category (clusters are represented by the numbers in each category column). The clusters are not the same across different categories. Example: Cluster 1 in Type 1 is not the same as Cluster 1 in Type 2. The categories are further divided into 10 subcategories (metric 1 - metric 10) and the decimals represent the values of these metrics.
Aim: The aim is to use the cluster information to drill down to a name of interest and visualize the metric information.
My initial thinking: Use circle packing visualization to cluster the names for each type. When a cluster is chosen, show a heat map for the metric values of names in the chosen cluster.
I am looking for other possible and concise ways to visualize this information! Any suggestions are much appreciated.
I put together a matrix for various statistics by US state: https://docs.google.com/spreadsheets/d/1RWwROtd4d-OraIoaOX04klIc8IgIUPmUS1uLQx3pj6c/edit#gid=2053349644
Just by coloring them by Democrat/Republican, you can see a trend for most of the metrics. But what's the best way to visualize this? Just a series of bar charts (one chart per statistic)? What would you guys recommend?
Bonus points (+ reddit gold!) if you can give me an example or want to take a whack at it yourself.
I am looking for a (dynamic) visualization in the form of a map that shows an approximation of noise "annoyance" due to air traffic.
I know there are several "official" websites that map noise (one example is https://www.umgebungslaerm-kartierung.nrw.de -- it is in German; select "Flugverkehr" in the left menu, which means "air traffic"), but I have not found one that shows an approximation of the actual air traffic. The example only shows noise generated close to the airports, not noise generated by planes flying over the city.
I think this is very valuable for making choices about where you want to live, rent something or buy land/housing, and I am puzzled that there is no public visualization for that (none that I have found; enlighten me if this already exists).
Data seems to be available from https://www.adsbexchange.com/data/# or maybe other sources.
I have a visualization in mind that is e.g. a Google Map/OpenStreetMap with an overlay that (maybe even live in the browser) draws a translucent line of some width for every flight path in the given map rectangle, so that the color intensifies where flight paths overlap. You could also include height information and plane type to make it more accurate. I guess this is a lot of data, but maybe for a 50km by 50km rectangle and a week's worth of data you will get a good picture of what is going on, and I assume that would be < 10k planes to be tracked/drawn.
Any ideas on how to do that, does anyone know if it already exists, or does anyone want to help build it?
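The overlapping translucent lines amount to counting, for each small map cell, how many flights passed over it. A rough Python sketch of that counting step follows. It assumes the tracks have already been converted from latitude/longitude to kilometres in a local flat projection, ignores altitude and plane type, and samples each segment at half-cell steps, which is an approximation rather than an exact rasterization. The function names are made up for this example.

```python
from collections import Counter
import math

def cells_on_track(track, cell_km=1.0):
    """Return the set of grid cells one flight track passes through.

    track is a list of (x_km, y_km) points; each segment is sampled
    at half-cell steps, so very thin corner clips may be missed.
    """
    cells = set()
    for (x0, y0), (x1, y1) in zip(track, track[1:]):
        length = math.hypot(x1 - x0, y1 - y0)
        steps = max(1, math.ceil(length / (cell_km / 2)))
        for i in range(steps + 1):
            t = i / steps
            x, y = x0 + t * (x1 - x0), y0 + t * (y1 - y0)
            cells.add((math.floor(x / cell_km), math.floor(y / cell_km)))
    return cells

def overflight_counts(tracks, cell_km=1.0):
    """Count how many distinct tracks cross each grid cell."""
    counts = Counter()
    for track in tracks:
        counts.update(cells_on_track(track, cell_km))
    return counts

tracks = [
    [(0.5, 0.5), (3.5, 0.5)],   # west to east
    [(2.5, 0.5), (2.5, 2.5)],   # south to north, crossing the first track
]
density = overflight_counts(tracks)
print(density[(2, 0)])   # cell where the two example tracks overlap
```

The resulting counts could then be drawn as a heat map layer over a base map, with opacity or color scaled by the count.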
Looking for someone to help me map out all the boats running on all US rivers, including the Great Lakes. I'm looking for non-recreational boats: commercial boats, barges, etc.
Hi! I'm the tech lead on the Run for Office analysis and visualization project. We have this awesome data set of elected officials in Louisiana with city, parish, gender, race, and party. We would love it if anyone here could pull any interesting or useful insights out of the data in the form of visualizations!
We want to answer questions like:
Thanks in advance for anyone willing to help!
Specifically, my chances of winning prizes in the Powerball, the Colorado Lottery and the Rad Riches $20 scratcher. The odds are listed on the website.
I am looking for the best way to graph the correlation between a predictive score and a manual label in sets of data over time. In the process, a system predicts the likelihood that a user will label a document as ‘yes’ or ‘no’, and provides a set for the user once a day. I’m trying to display the progression of the correlation between high scores from the system and actual calls by the user. But I can’t find an effective way to represent all three ‘dimensions’ of the data. The data looks like this:
Each date (15 days total) has four lines to delineate the four possible labels. Columns 4-13 show the ten 10-point ranges of the system scores.
What I’d like is to have the date on the x axis, the number of labels applied on the y axis, and use the label applied as an aesthetic to differentiate the calls being made. My first thought was a density plot, but that’s missing one more dimension to show the system score. Any help you can give with the best way to visualize this data would be greatly appreciated.
What is the best way to visualise hierarchical data?
For example: we have to display the number of cars, organised as Continent > Country > City > Car Brand > Number of cars.
So how would I visualise each continent divided into countries, each country divided into cities, and so on?
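Chart types that suit this kind of data (treemaps, sunbursts, icicle charts) generally expect a nested tree with totals at every level rather than flat rows. A minimal Python sketch of that conversion is below; the rows are invented examples, and `build_tree` and the "World" root name are illustrative choices, not part of any particular charting library.

```python
def build_tree(rows):
    """Nest flat rows into {"name", "value", "children"} nodes, summing counts upward."""
    root = {"name": "World", "value": 0, "children": {}}
    for *path, count in rows:
        node = root
        node["value"] += count
        for level in path:
            # Create the child node the first time this name is seen at this level.
            node = node["children"].setdefault(
                level, {"name": level, "value": 0, "children": {}}
            )
            node["value"] += count
    return root

# Made-up rows: (continent, country, city, brand, number_of_cars)
rows = [
    ("Europe", "Germany", "Berlin", "VW", 120),
    ("Europe", "Germany", "Berlin", "BMW", 80),
    ("Europe", "France", "Paris", "Renault", 150),
    ("Asia", "Japan", "Tokyo", "Toyota", 300),
]
tree = build_tree(rows)
print(tree["children"]["Europe"]["value"])   # total cars in Europe for the example rows
```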
What's the best tool(s) to plot ~10000 points with labels and not have the labels overlap?
I looked at everything Python has to offer and haven't found anything solid. I've been using pyplot to make the plots and it can do 10000 points with labels no problem; the issue is that many of the labels overlap.
There is a package called adjustText that changes the positions of the labels so that they don't overlap, but it seems to handle at most 3500 points. Anything beyond that and Google Colab is not able to process the graph before the time limit for a session is up (12 hours), even in GPU mode.
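One cheaper alternative, sketched here under some simplifying assumptions: instead of moving labels, keep only the labels that do not collide with a label already kept, in priority order. This is a different technique from what adjustText does, and it drops labels rather than repositioning them. The sketch assumes every label occupies the same fixed-size box centered on its point; `pick_labels` is a made-up helper name. A grid lookup keeps each check to nearby points only, so 10000 points should be quick.

```python
from collections import defaultdict

def pick_labels(points, box_w, box_h):
    """Greedily keep labels whose boxes do not overlap an already kept box.

    points: list of (x, y, label), ordered by priority (most important first).
    Each label is assumed to occupy a box_w x box_h box centered on its point.
    """
    grid = defaultdict(list)   # grid cell -> kept points in that cell
    kept = []
    for x, y, label in points:
        cx, cy = int(x // box_w), int(y // box_h)
        # Only points in the 3 x 3 neighbouring cells can be close enough to overlap.
        clash = any(
            abs(x - ox) < box_w and abs(y - oy) < box_h
            for gx in (cx - 1, cx, cx + 1)
            for gy in (cy - 1, cy, cy + 1)
            for ox, oy in grid[(gx, gy)]
        )
        if not clash:
            grid[(cx, cy)].append((x, y))
            kept.append(label)
    return kept

points = [(0, 0, "a"), (0.5, 0.2, "b"), (3, 0, "c"), (3.9, 0.9, "d"), (10, 10, "e")]
kept = pick_labels(points, box_w=2, box_h=1)
print(kept)
```

The labels that survive can be drawn as usual; the dropped ones could be shown on hover in an interactive tool.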
I'm looking for a hierarchical visualization that captures the relational aspects of the dataset. D3.js is my preferred tool for the job but I'm open to ideas. I've played with basic dendrograms and network graphs. The biggest issue is that the dataset can overwhelm most displays, so I need some way to group the items together (based on hierarchy).
Does anyone know of any research that looks at what kinds of visualizations different user types typically like to see? For example, do CEOs typically use different kinds of visualizations than workers in their organizations?
It makes sense to me that different user types would prefer different visualizations, but I can't find any research to prove or disprove it.
Hello! I'm sitting on a ton of data points and am looking for a way to plot/chart/graph them that will be visually interesting and informative to our users. It is essentially just a long data set of finishing times for an online competition, containing the timestamp when it occurred, total finishing time (0 for a failed attempt), whether they finished or not (1 for yes, 0 for no) and, if the user was logged in, their username (blank means they did it as a guest).
I'd like to make this data publicly available to our users in a meaningful, infographic sort of way to give them an idea of (a) what the overall distribution of successful finishing times was, (b) how many people succeeded vs. how many failed, (c) who the top finishers were (by lowest time where there's a username), and maybe (d) break down finishing-time percentiles. I'd also like to (e) somehow compare the statistics for logged-in users (if a username is showing) vs guests (no username). And of course open to any other ideas you might feel are visually interesting or informative.
Reddit gold goes to the best solution. :-)
Thank you so much for your help!
Dataset is here in csv format: https://files.fm/u/ckax4ke7
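Before picking chart types, it may help to compute the numbers behind points (a) to (e). Below is a small Python sketch of those summaries. The column order (timestamp, finishing time, finished flag, username) follows the description in the post, but the rows themselves are invented, and the function names are only illustrative.

```python
from statistics import median, quantiles

# Made-up rows: (timestamp, finish_time_seconds, finished, username)
rows = [
    ("2019-01-01 10:00", 62.0, 1, "ann"),
    ("2019-01-01 10:05", 0.0, 0, ""),
    ("2019-01-01 10:09", 48.5, 1, ""),
    ("2019-01-01 10:12", 75.0, 1, "bob"),
    ("2019-01-01 10:20", 0.0, 0, "cy"),
    ("2019-01-01 10:31", 55.0, 1, "dee"),
]

def summarize(rows):
    """Success rate, median, quartiles and fastest named finishers for a set of attempts."""
    times = [t for _, t, done, _ in rows if done == 1]
    named = [(t, user) for _, t, done, user in rows if done == 1 and user]
    return {
        "attempts": len(rows),
        "success_rate": len(times) / len(rows),
        "median_time": median(times),
        "quartiles": quantiles(times, n=4),   # needs at least two successful times
        "top": sorted(named)[:3],             # lowest times first
    }

def split_by_login(rows):
    """Separate attempts by logged-in users from guest attempts (blank username)."""
    logged_in = [r for r in rows if r[3]]
    guests = [r for r in rows if not r[3]]
    return logged_in, guests

overall = summarize(rows)
logged_in, guests = split_by_login(rows)
print(overall["success_rate"], overall["median_time"], overall["top"])
```

Calling `summarize` separately on the logged-in and guest lists gives the comparison for point (e), as long as each group has at least two successful attempts.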
I wanted to make a US map that visualizes deaths from drug poisoning using CDC data. I want to make one that compares 1999 vs 2016. I am okay in R -- def have a LOT to learn. This is my first time trying to create a visualization like this. I am using this guide to help me. The very last section of my code is giving me the following error:
Error: geom_polygon requires the following missing aesthetics: x, y
I am 99% sure the x and y aesthetics are latitude and longitude from the "us" variable.
I know I am going to have to adjust the theme and add the title, and pick a gradient for the rate, etc. I just want to get something first to play around with. Thank you!
library(tidyverse)
library(ggplot2)
library(maps)
library(mapdata)
library(ggmap)

drug_deaths_1999 <- drug_deaths %>%
  select(State, Year, Deaths, Population) %>%
  filter(Year == 1999, State != "United States") %>%
  mutate(rate = (Deaths/Population) * 100000)
drug_deaths_1999$State <- tolower(drug_deaths_1999$State)

drug_deaths_2016 <- drug_deaths %>%
  select(State, Year, Deaths, Population) %>%
  filter(Year == 2016, State != "United States") %>%
  mutate(rate = (Deaths/Population) * 100000)
drug_deaths_2016$State <- tolower(drug_deaths_2016$State)

states <- map_data("state")
states$State <- states$region

drug_deaths_1999 <- inner_join(drug_deaths_1999, states)
drug_deaths_2016 <- inner_join(drug_deaths_2016, states)

us <- ggplot(data = states) +
  geom_polygon(aes(x = long, y = lat, group = group), color = "white") +
  coord_fixed(1.3) +
  guides(fill=FALSE)

ditch_the_axes <- theme(
  axis.text = element_blank(),
  axis.line = element_blank(),
  axis.ticks = element_blank(),
  panel.border = element_blank(),
  panel.grid = element_blank(),
  axis.title = element_blank()
)

## this is not working
us +
  geom_polygon(data = drug_deaths_1999, aes(fill = rate), color = "white") +
  geom_polygon(color = "black", fill = NA) +
  theme_bw() +
  ditch_the_axes
What programming languages were the two articles below created with? Python, React, R?
I think I caught a contractor lying about her location for YEARS.
Before I bark up the chain, I want to boil it down to something easy and visual for my boss to digest.
He is not one for spreadsheets and I need IMMEDIATE action taken since this contractor is in the healthcare segment. My boss literally reacts in minutes to graphs; hours or days will go by if he is given a spreadsheet.
So... how do I do this?
Dataset has this: DATETIME USERNAME IP ADDRESS
IP addresses are largely the same for each facility. For example: 192.168.1.1 is always Office A, 192.168.1.13 is always Office B, with rare exceptions. (And very easily explained exceptions, as I get the contractor's logs daily.)
I was thinking a chart? But maybe an animation might do a better job telling the data story?
Any help is appreciated!
Tools I have: Ubuntu Linux, Inkscape, GIMP, and the LibreOffice spreadsheet, which is very close to Excel. Open to downloading a program to do this, if need be.
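A possible first step, sketched in Python (which runs on Ubuntu without extra packages): translate each IP address to its office and tally, for the contractor, how many log entries came from each office per day. That small table is what a chart or calendar-style grid would then be built from. The two addresses come from the post; the office names, the sample rows, the username, and the timestamp format are assumptions for illustration.

```python
from collections import Counter, defaultdict

# The two addresses are from the post; the mapping as a whole is a placeholder.
OFFICE_BY_IP = {"192.168.1.1": "Office A", "192.168.1.13": "Office B"}

def offices_per_day(log_rows, username):
    """Tally, for one user, how many log entries came from each office per day."""
    days = defaultdict(Counter)
    for stamp, user, ip in log_rows:
        if user == username:
            day = stamp[:10]   # assumes "YYYY-MM-DD HH:MM" style timestamps
            days[day][OFFICE_BY_IP.get(ip, "unknown")] += 1
    return days

# Made-up log rows: (datetime, username, ip_address)
log_rows = [
    ("2019-03-01 08:02", "contractor1", "192.168.1.1"),
    ("2019-03-01 12:40", "contractor1", "192.168.1.1"),
    ("2019-03-02 08:05", "contractor1", "192.168.1.13"),
    ("2019-03-02 09:15", "someone_else", "192.168.1.1"),
    ("2019-03-03 08:01", "contractor1", "10.0.0.7"),
]
tally = offices_per_day(log_rows, "contractor1")
for day in sorted(tally):
    print(day, dict(tally[day]))
```

Placing the claimed location next to the logged office for each day, with mismatches colored, would probably read at a glance.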
So the dataset will come exclusively from the titles in r/Igot out: country names and abbreviations. Denmark: DK; Netherlands and/or Holland: NL; etc.
Just collect the names and place them, like the opposite of a word cloud, on a world map.
If you feel adventurous, try to add some filters and stuff. Take into account title sentences such as "I will go to either x, y or z", where more than one country is mentioned; "US to Europe", where Europe is a region/continent; and "from x, to y", which indicates where they moved from and to. Include symbols such as "->", which also mark where they moved from and to.
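A naive Python sketch of the title parsing described above, to show the general shape. The lookup tables are tiny illustrative samples, not a complete country list, and the rule is deliberately simple: country mentions before the first "to" or "->" count as "from", mentions after it count as "to". Two-letter codes are matched only in uppercase so that ordinary words are not mistaken for codes. Regions such as "Europe" are not in the tables and are simply ignored here.

```python
import re

# Tiny illustrative lookup tables; a real version would cover every country.
NAMES = {"denmark": "DK", "netherlands": "NL", "holland": "NL", "germany": "DE"}
CODES = {"DK", "NL", "DE", "US"}

TOKEN = re.compile(r"->|[A-Za-z]+")

def parse_title(title):
    """Return (before, after): country codes seen before and after the first 'to' or '->'.

    If the title has no such marker, every country found ends up in `before`.
    """
    before, after = [], []
    seen_marker = False
    for tok in TOKEN.findall(title):
        if tok == "->" or tok.lower() == "to":
            seen_marker = True
            continue
        code = tok if tok in CODES else NAMES.get(tok.lower())
        if code:
            (after if seen_marker else before).append(code)
    return before, after

print(parse_title("from Denmark -> Holland"))
print(parse_title("I will go to either DK, NL or DE"))
print(parse_title("US to Europe"))
```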
I have sales revenue for four companies that I want to compare over time, while also comparing the growth from one period to the next (quarterly in this case). I currently have a combo chart with columns representing the revenue and a line representing the growth, but it feels way too busy for quick interpretation. Any other potential ways to display this?
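One way to reduce the clutter might be to split the two measures into separate panels: revenue as one line per company, and period-over-period growth in its own small chart or table below. The growth figures can be computed as in this Python sketch; the company names and revenue values are invented, and the function assumes no period has zero revenue.

```python
def quarter_over_quarter(revenue):
    """Return percentage growth between consecutive periods (None for the first period)."""
    growth = [None]
    for prev, cur in zip(revenue, revenue[1:]):
        # Assumes prev is never zero.
        growth.append(round((cur - prev) / prev * 100, 1))
    return growth

# Made-up quarterly revenue per company.
revenue = {
    "Company A": [100, 110, 121, 115],
    "Company B": [200, 190, 209, 230],
}
growth = {name: quarter_over_quarter(values) for name, values in revenue.items()}
for name, values in growth.items():
    print(name, values)
```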
For instance, I have two shampoos, a dry-scalp lotion and a night cream. I used these 70-95% correctly. I need to be able to showcase this to my M.D. and dermatologist to make convincing arguments, whether on an iPad mini (iOS 9.3.5, without Excel and most App Store apps), on printouts, or, if I fix the screen next week, on a Surface Pro 3.
I have 20+ habits and I need 7 of them. Ish. Some bad, some good. Like food in bed (bad) vs. applied shampoo #2 (good).
No pie charts, I think. Just a high school graph with lines.
Exporting to Excel gives 1 column with weird rows for each entry. I have zero idea where to start in restructuring the data. I know very little HTML, and the 2003 and 2010 versions of Excel.
How do I convert exported .csv Habitbull data to nice graphs with multiple habits?
Sorry for my question context, I couldn't figure out how to word it.
Basically, every minute I am fetching what game I am playing/what channel I am in etc.
I want to display a graph that shows, for example, that from 9:00 to 10:00 he was playing League, at 9:30 he was playing CSGO, etc.
I was using a stepped area graph to achieve this: https://imgur.com/a/8FSEHbH
However, with this graph, if the other 2 are false (which they will always be), it fills in the rest of the area. I want each block to remain in its own range (1 tall, X wide, depending on time online).
If anyone knows of a better way to visualize this, that would be great.
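A timeline (Gantt-style) chart usually fits this better than a stepped area chart, and it needs start/end intervals rather than one sample per minute. A short Python sketch of that conversion is below. It assumes each sample is a (minute of day, activity) pair and that consecutive minutes with the same activity belong to one session; the sample values and the function name are invented.

```python
def to_intervals(samples):
    """Collapse per-minute (minute, activity) samples into (activity, start, end) runs.

    `end` is exclusive; a gap in the minutes starts a new run.
    """
    runs = []
    for minute, activity in samples:
        if runs and runs[-1][0] == activity and runs[-1][2] == minute:
            # Same activity, directly following the previous sample: extend the run.
            runs[-1] = (activity, runs[-1][1], minute + 1)
        else:
            runs.append((activity, minute, minute + 1))
    return runs

# Made-up samples: minutes since midnight (540 = 9:00).
samples = [(540, "League"), (541, "League"), (542, "CSGO"), (545, "CSGO")]
runs = to_intervals(samples)
print(runs)
```

Each run can then be drawn as one bar, 1 tall and as wide as its duration, on a row per game.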