bms111 2 points

I don't really think that this is a fair representation. Considering the baby boom, in order for the 2000 data to match the 1900 data, the generations after it would have been off the charts.

SLPeoples 1 point

If you look at the dashboard link, hovering over the graphs will reveal each point's information. It's disorienting because the percentages are with respect to their year's total population, and the population is much larger now than it was in 1900. By representing things as "percent of total" we can see the stark differences in our demography.
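
To make the "percent of total" idea concrete, here's a minimal Python sketch. The counts are made-up placeholders (not the figures behind the dashboard), and `percent_of_total` is just a hypothetical helper name.

```python
def percent_of_total(counts):
    """Convert raw counts per age group into percentages of that year's total."""
    total = sum(counts.values())
    return {group: 100.0 * n / total for group, n in counts.items()}

# Placeholder populations in millions, for illustration only.
year_a = {"0-19": 30, "20-39": 25, "40-59": 15, "60+": 5}    # smaller, younger population
year_b = {"0-19": 80, "20-39": 80, "40-59": 80, "60+": 50}   # larger, older population

shares_a = percent_of_total(year_a)
shares_b = percent_of_total(year_b)

# The youngest group grew in absolute terms but shrank as a share of the total.
for group in year_a:
    print(f"{group:>6}: {shares_a[group]:5.1f}% -> {shares_b[group]:5.1f}%")
```

Each year's shares sum to 100, so a group can grow in raw numbers and still look smaller on the chart.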

KerPop42 1 point

Gotta love that baby boom

SLPeoples 3 points

This visualization is more about the relatively few children born between 1970 and 2000, which is what the seemingly missing chunk shows. You can also see it in the upward shift of the median age, which is by definition an "aging population". The baby boomers aren't very well represented in this visualization.

SLPeoples commented on a post in r/dataisbeautiful
FogleMonster 3 points

For this visualization, I used an NES emulator (of my own creation) to record a snapshot of the Nintendo's RAM at each frame (60 fps) for 5 seconds. The NES had just 2048 bytes of RAM! For each address in memory, I plotted its values (ranging from 0 to 255) over time as an individual sparkline. I only included addresses that changed at least once, so there are fewer than 2048 sparklines. Furthermore, I individually scaled each sparkline so as to utilize the full vertical range. Because each game developer used the memory in different ways, each game produces its own unique look when plotted in this way. But I'm fond of the Donkey Kong plot - so here it is!

The emulator (written in Go): https://github.com/fogleman/nes

The rendering code (meant for pen plotting): https://github.com/fogleman/axi

And finally, here's a video of it being drawn with an AxiDraw pen plotter: https://www.youtube.com/watch?v=VxXVcmPwQrM
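
For anyone curious about the data preparation described above, here's a rough sketch in Python (not the Go emulator or the linked rendering code). The snapshots are synthetic stand-ins for emulator output, and the function names are hypothetical.

```python
RAM_SIZE = 2048   # bytes of NES RAM, as stated above
FRAMES = 300      # 5 seconds at 60 fps

def changed_addresses(snapshots):
    """Return the addresses whose value differs from the first frame at least once."""
    first = snapshots[0]
    return [addr for addr in range(len(first))
            if any(frame[addr] != first[addr] for frame in snapshots)]

def scale_series(values):
    """Rescale one address's values to [0, 1] using its own min and max."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def sparklines(snapshots):
    """Map each changing address to its individually scaled time series."""
    return {addr: scale_series([frame[addr] for frame in snapshots])
            for addr in changed_addresses(snapshots)}

# Synthetic RAM snapshots: only three addresses ever change.
snapshots = []
ram = [0] * RAM_SIZE
for i in range(FRAMES):
    ram[0x10] = i % 256                      # a counter that wraps around
    ram[0x20] = (i * 7) % 256                # a faster counter
    ram[0x30] = 200 if i >= 150 else 100     # a value that switches once
    snapshots.append(list(ram))

lines = sparklines(snapshots)
print(len(lines), "sparklines out of", RAM_SIZE, "addresses")
```

Filtering out constant addresses first also guarantees `hi > lo` in `scale_series`, so the per-line scaling never divides by zero.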

SLPeoples 1 point
FogleMonster 2 points

I have that book. :)

In fact, Edward Tufte saw my post about this on Twitter and he loved it! I was so happy. (Didn't realize he was even on Twitter until that moment!)

https://twitter.com/EdwardTufte/status/954537749234765825

SLPeoples 3 points

The visualization really reminded me of his style, and I figured you would like it. I'm excited by the response he gave you! Great job.

SLPeoples commented on a post in r/dataisbeautiful
Revlong57 -2 points

Go back to r/the_cheeto with that mess.

SLPeoples 0 points

Rule 8.

Posts regarding American Politics, or contentious topics in American media, are only permissible on Thursdays (ET).

It's Thursday.

Revlong57 -2 points

Ok, my problem with your comment wasn't that it's political, my issue is that it's super racist, and not even remotely true.

SLPeoples 1 point

I think you should reevaluate that claim, and perhaps return with substantial evidence, because I don't believe anything I have said is hateful in any way. Mexico is not a race of people; it is a nation.

SLPeoples commented on a post in r/The_Donald
garbagetime01 18 points

Silenced lips hiding proof of gross abuse of government surveillance to influence an election leads to more corruption.

SLPeoples 1 point

But you don't fight corruption with lawlessness; there are proper channels through which justice should be done. I agree that the people have a right to know, but that doesn't negate the existence (or necessity) of the law.

garbagetime01 2 points

This would be PRIME whistleblower territory.

SLPeoples 2 points

I wholeheartedly agree.

SLPeoples commented on a post in r/MLQuestions
SLPeoples 2 points

This is a limited opinion, but I found that my networks (typically Keras with a TF backend) run significantly slower on Windows 10 with TF GPU than on the CPU. I'm using an i7-6700 and a GTX 1080, if that changes anything.

rajicon17 1 point

Do you mean your GPU is slower than your CPU, or your GPU is slower on Windows compared to Ubuntu?

SLPeoples 1 point

I mean that tensorflow ran slower when I enabled GPU settings on Windows, and I did not notice that problem on an Ubuntu system.

The reason I asserted that it was a limited opinion was because it could have had something to do with the specific networks I was working with (I would consider myself a novice), so I don't want to make generalizations.

SLPeoples commented on a post in r/rstats
DangerousPie 12 points

You could try using the country flag unicode characters: https://emojipedia.org/flags/

Not sure if that will work, but worth a shot!

SLPeoples 8 points
SLPeoples 1 point

Using the WordCloud package in python, I used a simple script to parse the entirety of the text from Fight Club by Chuck Palahniuk. I used a stencil to overlay the word cloud.

Data: Fight Club Text

Tools: Python
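
A word cloud sizes words by how often they occur, so the core step is a frequency count. Here's a minimal standard-library sketch of that step. The sample sentence, the stop-word list, and `word_frequencies` are placeholders, and the WordCloud package's own tokenizing and stop-word handling may differ.

```python
import re
from collections import Counter

# Tiny placeholder stop-word list; a real one would be much longer.
STOPWORDS = {"the", "a", "is", "of", "and"}

def word_frequencies(text):
    """Lowercase the text, split it into words, and count the non-stop-words."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(w for w in words if w not in STOPWORDS)

sample = "Soap and rules: the soap is the rule, and the rule is soap."
freqs = word_frequencies(sample)
top = freqs.most_common(1)
print(freqs.most_common(3))
```

A mapping like `freqs` is the kind of input a word cloud renderer scales font sizes from.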

keithwaits 1 point

How come FI "Tyler" occurs multiple times?

SLPeoples 2 points

The WordCloud package tries to optimize the proportionality of the words used, while also remaining in the confines of the defined mask, so I could see how it might separate some values which would take up too much space otherwise.

SLPeoples 2 points

I used a recurrent neural network to read through excerpts of Trump's speeches over the last couple years and trained it to create blocks of text similar to his speaking style. I am in the process of refining the model, but figured you guys would like the word cloud I created in the process.

FreeBased1 2 points

Dude if you can do that, you've got major talent that can help MAGA!

Very cool!

SLPeoples 1 point

Thanks man. I appreciate the kind words.

SchruteCapital 1 point

Is there any way you’d be able to upload this data set? My professor would love this as a teaching example for clustering.

SLPeoples 1 point
-RiskManagement- -3 points

Elbow Method? I could just count to 5

SLPeoples 1 point

Alright, so if you have a dataset like this, where you have a lot of categorical features:

https://i.imgur.com/p0cH5Xa.png

It's pretty hard to find a way to visualize their relationships, since the most we can understand is three dimensions.

So I went ahead and took this dataset, which was classified here, and ran it through the same script described above.

The great thing about the elbow method is that I can now choose an optimal number of clusters, instead of doing it visually, since that's out of the question.

https://i.imgur.com/ADIA2RM.png

Here, you could choose two because the curve is not very smooth there, or you could see six as a good place to cut off. I'll show you what the visualization reveals for both.

  • For 2 clusters

https://i.imgur.com/srT4VHG.png

https://i.imgur.com/IqhoWVr.png

  • For 6 clusters

https://i.imgur.com/0EX8sW6.png

As you can clearly see, the visual representations are nearly arbitrary since we can't see everything in two dimensions. For the six-cluster case, you can't even see three of the clusters. In this instance, the elbow method lets you choose the optimal number of clusters, which was a large part of the exercise. It turns out that two is optimal: X is the dataset above without the column for Poisonous (P) or Edible (E), and those two categories are fairly distinct. This example was just a quick scratch application, but hopefully it provides a little more insight into the versatility of the seemingly trivial visualization that was posted.

Here's a couple wiki links as well

https://en.wikipedia.org/wiki/Elbow_method_(clustering)

https://en.wikipedia.org/wiki/K-means_clustering
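
For reference, here's a small self-contained sketch of the elbow method in plain Python. It is not the script mentioned above: it assumes k-means on numeric 2-D points (the dataset in the screenshots is categorical and would need encoding first), uses synthetic data, and picks the elbow with one simple heuristic among several.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two 2-D points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

def mean(points):
    """Centroid of a non-empty list of 2-D points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def kmeans_inertia(points, k, iterations=20):
    """Run a basic k-means and return the within-cluster sum of squares."""
    # Deterministic start: k points spread evenly through the list.
    centers = [points[i * len(points) // k] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centers[i]))
            clusters[nearest].append(p)
        # Keep the old center if a cluster ends up empty.
        centers = [mean(c) if c else centers[i] for i, c in enumerate(clusters)]
    return sum(min(sq_dist(p, c) for c in centers) for p in points)

# Synthetic data: two tight 5x5 grids far apart, so the expected answer is 2.
blob = [(x, y) for x in range(5) for y in range(5)]
points = blob + [(x + 100, y + 100) for x, y in blob]

inertia = {k: kmeans_inertia(points, k) for k in range(1, 7)}

# Elbow heuristic (an assumption, not the only choice): the k where the curve
# bends most sharply, i.e. the largest second difference of the inertia values.
elbow = max(range(2, 6),
            key=lambda k: inertia[k - 1] - 2 * inertia[k] + inertia[k + 1])
print(inertia, "elbow at k =", elbow)
```

The inertia drops sharply from one cluster to two and barely moves after that, which is the "elbow" you look for on the plot.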

SLPeoples commented on a post in r/MachineLearning
kayaking_is_fun 1 point

By generative modeling I’m referring to Bayesian modeling and probabilistic approaches, not GANs - these can provide PDFs over outputs rather than just answers and will tend to have low certainty in areas with out of sample data

SLPeoples 2 points

Thanks for the clarification. That makes more sense now.

kayaking_is_fun 3 points

Does anyone else feel like these adversarial attacks will eventually lead to a more robust solution than neural networks becoming the norm for any kind of security conscious problem? Or is this already the case?

NNs have such unpredictable behaviour when presented with out-of-sample data that the generative modelling approach looks much nicer.

SLPeoples 2 points

Aren't GANs just NNs with a discriminator though?

SLPeoples commented on a post in r/MapPorn
sniper989 2 points

I'm quite sure that asylum seekers aren't counted as immigrants also, although I could be mistaken

SLPeoples 1 point

Just look at this visualization from The Economist and you'll see the similarities I'm describing. You're not wrong in saying that "All of the immigration shown in the OP cannot be attributed to asylum seekers", but I would challenge you in saying that a majority of the datapoints in this visualization contain asylum seekers.

The shading in Europe, depicting specifically asylum seekers, is remarkably similar to the OP. The conclusion that can be drawn is that the shading suggests that those countries with a darker hue have a more welcoming immigration policy, and are thus the countries which perhaps have taken more refugees; so to draw a similar conclusion from the remarkably similar visualization posted is not fallacious.

Whether one takes this as a positive, or negative fact is entirely based on political leaning, but the conclusions are absolutely not "ludicrous".

https://cdn.static-economist.com/sites/default/files/images/2016/05/blogs/graphic-detail/20160528_srm976.png

sniper989 1 point

Again, asylum seekers aren't even included in the stats iirc. And yeah countries with advanced economies and large populations get lots of immigrants -- I don't need a map to tell me that

SLPeoples 1 point

I fear that you're just arguing instead of having a discussion, so I'm going to go ahead and stop responding to you now. You are incredibly unreasonable.

I would suggest reading some more about this topic before making nonsensical arguments in the future. Here's but one resource.

https://www.economist.com/blogs/economist-explains/2015/09/economist-explains-4

drivenbydata 2 points

nice, I like how the data grouping etc is explained in the slides.

one thing that always puzzles me is why we call companies with hundreds of employees and several millions in revenue "startups".

I'd love to see a similar analysis for companies with less than 10 employees and less than 1 million in revenue. like, real startups.

SLPeoples 2 points

It's weird how industries define things in a completely different manner than one would expect, I agree. I think we're basing "Startup" on the following:

A startup company (startup or start-up) is an entrepreneurial venture which is typically a newly emerged, fast-growing business that aims to meet a marketplace need by developing a viable business model around an innovative product, service, process or a platform.

A great feature of Tableau is that you can directly download the workbook and just change the data source. As long as your features are labeled the same way (you can rename them or change their aliases within the program), you will get the same visualization immediately, or after a few minor edits!

SLPeoples 1 point

Interactive dashboard created with Tableau. Users can set thresholds for the number of startups they're interested in, as well as cutoffs for Expenses and Revenue. Locations are visualized on a map to allow for geography-based decisions.

Blog Post: https://slpeoples.blogspot.com/2018/01/blog-post.html

Interactive Dashboard: https://public.tableau.com/profile/samuel.l.peoples#!/vizhome/1000Startups_58/TheStartupQuadrant

Data: https://github.com/SLPeoples/Advanced-Tableau-DS/blob/master/03-Advanced-Table-Calculations/CourseContent/Coal-Terminal.xlsx

Feedback is welcomed and appreciated!

SLPeoples commented on a post in r/MapPorn
SLPeoples -5 points

I think your data may be incorrect, because Japan is an ally of the United States. https://i.imgur.com/ys3Y5x3.png

On what metric are the different classifications made?

anonymous-t- 34 points

Yes, of course. But this is the opinion of the American people as a whole, not the government.

SLPeoples 11 points

Oh I see! Thanks for clarifying this.

SLPeoples 3 points

Exercise in optimizing cryptocurrency portfolios using a Monte Carlo algorithm, based on John Geenty's article (https://medium.com/@geenty/optimizing-your-cryptocurrency-portfolio-with-python-4c3d4c824a7f). It uses Sharpe ratios to find a low-volatility, high-return portfolio across BTC, ETH, ETC, LTC, DASH, NEO, ZEC, and XMR. The visualization was taken to Tableau for a closer look at the different portfolios and backtesting.

Blog Post: https://slpeoples.blogspot.com/2018/01/optimizing-digital-currencies.html

Interactive Dashboard (With allocations): https://public.tableau.com/profile/samuel.l.peoples#!/vizhome/CryptocurrencyOptimization/50000PortfoliosBTCETCETHDASHLTCZECXMRSharpeRatios

Script: https://github.com/SLPeoples/Python-Excercises/blob/master/CryptocurrencyOptimization/pullData2.py

Data: https://github.com/SLPeoples/Python-Excercises/tree/master/CryptocurrencyOptimization/2018-01-10

Feedback is welcomed and appreciated!
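
For readers who want the gist without opening the script, here's a stripped-down sketch of a Monte Carlo portfolio search. It is not the linked script: the returns are synthetic, the asset names are placeholders, the risk-free rate is assumed to be zero, and the Sharpe ratio is left un-annualized.

```python
import random
import statistics

def random_weights(n, rng):
    """Draw n non-negative weights that sum to 1."""
    raw = [rng.random() for _ in range(n)]
    total = sum(raw)
    return [r / total for r in raw]

def sharpe_ratio(weights, returns, risk_free=0.0):
    """Mean excess return of the weighted portfolio divided by its volatility.

    `returns` is a list of per-period rows with one return per asset.
    """
    portfolio = [sum(w * r for w, r in zip(weights, row)) for row in returns]
    excess = statistics.mean(portfolio) - risk_free
    return excess / statistics.stdev(portfolio)

rng = random.Random(42)
assets = ["COIN_A", "COIN_B", "COIN_C"]   # placeholder names

# Synthetic daily returns: (mean, volatility) per asset are made up.
params = [(0.001, 0.02), (0.002, 0.05), (0.0005, 0.01)]
returns = [[rng.gauss(mu, sigma) for mu, sigma in params] for _ in range(250)]

# Monte Carlo step: sample many random allocations and keep the best one.
candidates = [random_weights(len(assets), rng) for _ in range(2000)]
best = max(candidates, key=lambda w: sharpe_ratio(w, returns))

for name, weight in zip(assets, best):
    print(f"{name}: {weight:.1%}")
```

More samples give a finer search over allocations, at the cost of run time.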

SLPeoples 1 point

OC developed in Tableau. Graph points were formatted in eight-hour buckets, with vertical axes representing idle capacity. If a machine did not move 90% or more of its coal capacity across an eight-hour period, it is quite easy to tell. Any machine displaying an idle capacity over 10% during an eight-hour period is flagged for repairs, as is any machine trending towards an idle capacity of that magnitude.

Storyboard: https://public.tableau.com/views/CoalTerminalMaintenance_2/CoalTerminalMaintenanceAnalysis?:embed=y&:display_count=yes

Blog post: https://slpeoples.blogspot.com/2018/01/coal-terminal-maintenance-analysis.html

Data: https://github.com/SLPeoples/Advanced-Tableau-DS/tree/master/03-Advanced-Table-Calculations/CourseContent

Feedback is welcomed and appreciated!
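
The flagging rule can be sketched in a few lines of Python. The machine names and tonnages below are invented for illustration, the helper names are hypothetical, and the "trending towards" check is left out.

```python
def idle_capacity(moved, capacity):
    """Fraction of capacity left unused in one eight-hour bucket."""
    return 1.0 - moved / capacity

def flag_for_repair(buckets, threshold=0.10):
    """Return machines with idle capacity above the threshold in any bucket.

    `buckets` is a list of (machine, tonnes_moved, tonnes_capacity) tuples,
    one per machine per eight-hour window.
    """
    flagged = set()
    for machine, moved, capacity in buckets:
        if idle_capacity(moved, capacity) > threshold:
            flagged.add(machine)
    return sorted(flagged)

# Invented example data.
buckets = [
    ("Machine1", 950, 1000),   # 5% idle
    ("Machine1", 920, 1000),   # 8% idle
    ("Machine2", 850, 1000),   # 15% idle, above the 10% threshold
    ("Machine3", 990, 1000),   # 1% idle
]
flagged = flag_for_repair(buckets)
print(flagged)
```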

SLPeoples 1 point

It's taking me a minute to figure out what links the community doesn't like; so far I've realized direct posts to blogs and Tableau aren't taken too well (which is pretty understandable for those on mobile)! I've reuploaded as a static image, but the details of this busy visualization are below!

Twitter Sentiment Analysis using Tableau and R. Tweets are scraped within a two-week period using R, and are then saved to output files for analysis in Tableau. The dashboard was developed to provide insight into the popularity and sentiment of tweets about the Apple iPhone and Samsung Galaxy. Sentiment is scored based on two lexicons, one for "good words" and one for "bad words".

Blog Post: https://slpeoples.blogspot.com/2018/01/sentiment-analysis-of-twitter-users.html

Interactive Dashboard: https://public.tableau.com/profile/samuel.l.peoples#!/vizhome/TwitterSentimentAnalysis_4/SentimentAnalysis

Data: https://github.com/SLPeoples/Text-Mining-Sentiment-Analysis/tree/master/02_DataHarvesting/CourseContent/scrapedDatasets

Script: https://github.com/SLPeoples/Text-Mining-Sentiment-Analysis/blob/master/02_DataHarvesting/CourseContent/dataHarvesting.r

Feedback is welcomed and appreciated!
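
As a rough illustration of two-lexicon scoring (in Python rather than the R script linked above), the sketch below assumes the score is positive matches minus negative matches. The word lists and example tweets are tiny placeholders.

```python
import re

# Placeholder lexicons; real ones contain thousands of words.
POSITIVE = {"good", "great", "love", "fast"}
NEGATIVE = {"bad", "slow", "hate", "broken"}

def sentiment_score(tweet):
    """Count positive-lexicon words minus negative-lexicon words."""
    words = re.findall(r"[a-z']+", tweet.lower())
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

tweets = [
    "I love the camera, great screen",
    "Battery is bad and charging is slow",
    "Just bought a new phone",
]
scores = [sentiment_score(t) for t in tweets]
print(scores)
```

A score above zero reads as positive, below zero as negative, and zero as neutral or unmatched.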

binfandstuff 12 points

this is a really underrated viz. I like this a lot but I would maybe throw in: 1) A legend for the bubble size 2) A bar graph or something that also dynamically changes to visualize the trend among regions more clearly.

SLPeoples 5 points

It's definitely something that would be best suited for a presentation, while a more detailed visualization is required for it to stand on its own. Thanks for the suggestion!

Either way, I'm certain you can pick out which bubbles belong to China, Japan, India, and the United States without the legend!

abscae 5 points

This is brilliant...any hopes for someone who has never coded to get to that level if I teach myself?

SLPeoples 8 points

I taught myself Tableau using a few $10 courses on Udemy over the last few months. It definitely makes the visualization process enjoyable and intuitive.

Tableau Public is free, and they have a handful of tutorials that will get you running. The coding is minimal and quite easy to learn (about the same as Excel, if not less), while a majority of the work is done on the backend. Just grab some datasets and see what you can make!

469 Karma
421 Post Karma
48 Comment Karma

Veteran. Pragmatist. Analyst.

About slpeoples

  • Reddit Birthday

    January 13, 2018
