I can't believe I watched long enough to get to that punch line, but it was worth it.

Sklearn typically uses a Classification and Regression Tree (CART) type algorithm for tree-based classifiers, and its implementation treats every feature as a number, so it is necessary to one-hot encode categorical variables. Integer codes would imply an order and magnitude that the categories don't have, i.e., an ID of 2 is not twice an ID of 1. The upside of CART algorithms is much more flexible handling of numerical data: a tree can branch on an interval. For example, if A > 5 and B is true, then this record belongs in class C.
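
A minimal sketch of that setup, assuming a made-up toy dataset (the column names `color` and `A` and the labels are invented for illustration). The categorical column is one-hot encoded, the numeric column is passed through untouched, and the tree is free to split on a threshold such as A > 5:

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder
from sklearn.tree import DecisionTreeClassifier

# Toy data (hypothetical): one categorical feature, one numeric feature.
X = pd.DataFrame({
    "color": ["red", "blue", "green", "red", "blue", "green"],
    "A": [2, 7, 9, 8, 1, 3],
})
y = ["D", "C", "C", "C", "D", "D"]  # label is "C" exactly when A > 5

# One-hot encode the categorical column; leave the numeric column as-is.
preprocess = ColumnTransformer(
    [("onehot", OneHotEncoder(handle_unknown="ignore"), ["color"])],
    remainder="passthrough",
)

model = make_pipeline(preprocess, DecisionTreeClassifier(random_state=0))
model.fit(X, y)

# Predict for a new record; the tree only needs the threshold on A here.
pred = model.predict(pd.DataFrame({"color": ["red"], "A": [6]}))
print(pred[0])
```

`handle_unknown="ignore"` is a choice made for this sketch: a category that was not seen during training is encoded as all zeros instead of raising an error.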

Hi, I don't want to start a new thread for this, but I have a related question. Is there any need for special treatment of a discrete numerical feature, such as 'number of legs', when predicting animal class?

Original Poster · 0 points · 4 months ago

For simple one-off tasks, Excel is a lot quicker and easier.

Mmm, I disagree, but acknowledge that it's a matter of personal opinion.

How so? Excel is way better at this IMO. If I need to do something quick it can often be better to use Excel or command line. There's a little more overhead to getting a notebook going.

Overhead difference is negligible. The biggest thing is in excel you do two or three filters/sorts/pivots whatever and there is no history of it. You have to remember the operations you did, and when you're returning to a sheet weeks later that can be confusing. With pandas it's all right there in front of you.
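
To make that concrete, here is a rough sketch of the same filter/sort/pivot sequence in pandas. The `orders` table and its columns are invented for illustration; the point is only that every step stays written down in the notebook:

```python
import pandas as pd

# Hypothetical data standing in for a small spreadsheet.
orders = pd.DataFrame({
    "region": ["west", "east", "west", "east", "west"],
    "product": ["a", "a", "b", "b", "a"],
    "revenue": [100, 250, 75, 300, 125],
})

# Step 1: filter. Keep orders of at least 100.
big = orders[orders["revenue"] >= 100]

# Step 2: sort. Largest orders first.
ranked = big.sort_values("revenue", ascending=False)

# Step 3: pivot. Total revenue per region and product.
summary = ranked.pivot_table(
    index="region",
    columns="product",
    values="revenue",
    aggfunc="sum",
    fill_value=0,
)
print(summary)
```

Coming back weeks later, the three steps can be read and rerun in order instead of reconstructed from memory.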

Building on that point, because everything is on record and you have to deliberately type out operations, it creates a workflow that is more purposeful and less about just looking at random slices of data.

I use excel for looking at data, but I'm not building any metrics or charts in it. It's reasonable to use for small stuff, but anything that I want to revisit I use pandas.

Hope you have at least one of the following in a primary location on your resume ['phd', 'stanford', '5 years', 'acquired']

Hi, I put a beautifully designed computer on a desk, give me upvotes!


Almost every Data Science interview I've had has had a case study portion. The case is representative of a business problem that a Data Scientist would face. I'm looking for a resource with practice cases that match the above description and example solutions. I am also interested if you have alternate methods for preparing for this type of interview.

Ex: We want to implement a new feature. How would you assess its viability? What data would you want to see? What is your recommendation?


These are typically questions you can answer using your own experience. If you are just starting it can indeed be challenging.

I don't know if such detailed use cases are available but a good approach would be to browse some of the Kaggle datasets.

Some of these datasets are from tech companies and you can either find some interesting questions to be answered (some are even listed) or you can find some kernels answering the questions (code and comments are available).

Examples of datasets:

Hope this helps.

Original Poster · 1 point · 5 months ago

Appreciate the advice. I'm not just starting, but I find that my experience doesn't always give me the best approach for companies in different business sectors. I will check out the kernels, but I'm still looking for a more turnkey approach.


Bumper, passenger side: front fender, door, area above the door, and rear fender are damaged and need to be replaced. I am a novice, but from what I have seen on youtube replacing front fender, rear fender and passenger door are doable projects. I plan to buy OEM parts and have a body shop paint the parts for me. Then remove the old and install the new, avoiding autobody markups.

Assuming the replacements go well, is there anything that might tip the dealership off that I did the repairs myself?

edit: removed unnecessary information

6 points · 6 months ago


The area above the door is part of the roof and that's structural, as is a rear quarter panel on a unibody car. Doors aren't always as easy to replace as you'd think. Fenders? Bumper skin? Yes.

And then you need to paint and have a good color match. This is all work that is way over the head of someone without experience doing body work, and even at a body shop doing structural work is a job for a specialist.

Original Poster · 0 points · 6 months ago

Thanks for the feedback

I'm struggling to figure out why you are doing these repairs yourself.

Original Poster · 0 points · 6 months ago

to save money


The assignment consists of 3 CSVs and a PDF with 10 questions. The questions are mostly SQL related, but they mention that including visualizations is encouraged if they will help with conveying the results.

I'm currently working in a Jupyter notebook, using standard Python libraries and psycopg2. Is psycopg2 the best library for SQL queries to a Postgres database? And is Jupyter a good way to go about presenting my work?
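
For reference, a rough sketch of the query pattern in question. It uses the standard library's sqlite3 as a stand-in so that it runs without a Postgres server; psycopg2 follows the same DB-API shape (connect, execute, fetchall), though its placeholder is `%s` rather than `?`. The table name, columns, and CSV contents are made up for illustration:

```python
import csv
import io
import sqlite3

# Hypothetical CSV contents standing in for one of the assignment files.
CSV_TEXT = """user_id,amount
1,20.0
2,35.5
1,4.5
"""

# In-memory SQLite database; with Postgres this would be psycopg2.connect(...).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE purchases (user_id INTEGER, amount REAL)")

# Load the CSV rows, converting text fields to numbers explicitly.
reader = csv.DictReader(io.StringIO(CSV_TEXT))
rows = [(int(r["user_id"]), float(r["amount"])) for r in reader]
conn.executemany("INSERT INTO purchases VALUES (?, ?)", rows)

# Parameterized query: total spend per user for purchases above a minimum.
query = """
    SELECT user_id, SUM(amount) AS total
    FROM purchases
    WHERE amount > ?
    GROUP BY user_id
    ORDER BY total DESC
"""
totals = conn.execute(query, (1.0,)).fetchall()
print(totals)
conn.close()
```

In a notebook, the query text, its parameters, and the result all sit in one cell, which is convenient when the answers need to be presented alongside the SQL.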



Sounds eerily similar to a position I'm interviewing for. Any chance you're in Socal?

Original Poster · 1 point · 6 months ago

How are you going to present your answers?

Original Poster · 3 points · 6 months ago

SF. All the analyst/data science roles I've applied to have had a heavy focus on SQL. I wasn't expecting that, but it makes sense.

Disney marketing - wet dream

That house is a stack of value adding features, more than it is well designed. Also they took out all the trees : /

Our date nights are always the same and we really look forward to them every time. We go to the local Market District (fancy grocery store) and get the same sushi rolls as always and pot stickers, we get dessert and I always get 2 macarons (usually pistachio and vanilla) and he gets whatever has raspberries. Then we take it back home and I make jack and cokes for each of us. Then we spend the rest of the night eating and drinking and playing co-op video games. Right now we’re playing the shit out of Super Mario Odyssey that I got him for Christmas.

Shout out to market district 412!

Yeah. I think it's because people spend more around the holidays, so they spend a little extra on postmates when they normally wouldn't get stuff delivered. Since they spent extra to get stuff delivered rather than picking it up themselves they save some by not tipping.

Over thinking it my guy. Holidays means more spending, if we're seeing a decrease in tips it means something is happening between the customer and us (i.e. the app/company).

Thanks OP literally looked up this sub to ask this same question.

I just started up again after a year. Tip rate on the 20 deliveries I've done over the last two days in Palo Alto is 5% (1/20) vs 70% at the beginning of 2017. Hoping that they will process and come through! At 13/hr before gas and insurance I don't have the money to test the app out and see if they changed the tipping process.


Hello, this is a classification problem with limited opportunity for feature engineering. I'm using a RandomForestClassifier, with GridSearchCV to tune the hyperparameters. I have two questions: 1. Have I implemented the random forest in the best possible way (will share my notebook)? 2. Where should I focus the rest of my efforts: trying more classifiers, or doing more with the random forest?

Down to chat wherever.
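
For anyone following along, a minimal sketch of the setup described above. It assumes synthetic data from `make_classification` and a small made-up parameter grid, since the actual notebook isn't shown here:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the real dataset (assumption).
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=4, random_state=0
)

# Hold out a test set that the grid search never sees.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Illustrative grid; real ranges depend on the data.
param_grid = {
    "n_estimators": [25, 50],
    "max_depth": [3, None],
    "min_samples_leaf": [1, 5],
}

search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    cv=3,
    scoring="accuracy",
)
search.fit(X_train, y_train)

# GridSearchCV refits the best model on all training data by default.
test_accuracy = search.score(X_test, y_test)
print(search.best_params_, round(test_accuracy, 3))
```

Scoring on the held-out split, not only on the cross-validation folds, gives a fairer number to compare against other classifiers.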


where is the part where they painted his lil tootsie roll green?


I'm going through a bunch of tutorials on matplotlib, pandas, numpy, sklearn, etc., and a lot of them are based on Jupyter notebooks, which seem to be pretty cool. But for me, following along, I want to write my code in a file so that it's easy to review. While writing the shell scripts in a file, I have run into some small but disruptive differences. Should I start getting used to doing data science in the shell? Or should I continue to find workarounds?


i see that reddit t

-2 points · 1 year ago

Must be nice to hear waste water flowing down those drain pipes each time someone takes a shitter, relaxing with a brandy in those luxury white armchairs :)

Original Poster · 9 points · 1 year ago

conversation starter

Maybe I'm just an uncultured ignoramus, but at least from that point of view it just looks like some drapes and some furniture pushed up against the walls to me. I'd honestly expect more from a world-renowned architect.

Original Poster · 17 points · 1 year ago

There is a shred of truth to your interpretation. Possibly when you decide to repurpose a cement factory you concede that some details might get overlooked.

Personally I like the sheer size of the space, the natural light, the comfortable fabrics. I could imagine myself sprawled on the floor, chilling on the side couch, sitting on the stairs, wrapped in the drapes. Iono, appreciate the comment tho.

The dog doesn't know what's going on.. he's like an unwitting person that produces amazing results without knowing how, which in and of itself breeds more amazing results... Wish I could think of a real world example of someone who is accidentally caught in a whirlwind of success

Went straight from the tonight show to r/celebs to see her.. OP you delivered.

Considering the fact that they just lost to the Baltimore Ravens (Very comparable to the Browns) why the fuck would you do that

Baltimore was a fluke (Tomlin fucked it).. they got complacent and came out with a weird game plan (run a lot), didn't let Ben get a rhythm. They're going to come out firing against the Browns, expect 35+ points.
