
I am currently studying Math & Stats - I'm finishing my 1st year this summer, so I am still not very advanced. But I think I've learnt enough math to understand some of the simpler ML concepts. I just don't know how to connect the dots at this stage.

A lot of the resources I found focus on using frameworks, or try to reduce the impact of the mathematics by introducing concepts with a "trust me on this".

I would like to fully understand the mathematics behind, for example, gradient descent or the backpropagation algorithm, without someone just giving me a formula and saying "trust me, this is how it works".

Different regression models, decision trees, probabilities... I would rather concentrate on pre-neural-network stuff at the moment.

Any medium is good: books, courses, YouTube series, FB groups, meetups (London, UK) - paid or free.


I was trying not to look up too many things on the wiki, but I found it impossible once I got to the Alchemy workbench.

I was trying to make the Flavor Enhancer for Quality Fertilizer II, but I gave up and looked it up.

There was no way for me to figure out the ingredients on my own.

Do you have a system, or are you hitting the wiki for Alchemy?


I think this game is great. The story and characters are brilliant. I often found myself forgetting that my ultimate goal is simply to be a graveyard keeper - I haven't been so immersed in a game for a very long time.

It is definitely worth the money.

1 point · 7 days ago · edited 7 days ago

This kind of problem takes practice; the more you do them, the easier they become.

Usually they involve splitting the problem into smaller chunks, then using the multiplication, subtraction and addition principles to stitch the answers together. I don't think there is a method for it other than practice.

I haven't watched the video yet, but looking at the plane problem (ii), you could notice that the front row is different from the last row and the middle rows. If I am not mistaken, the front row must be taken by the business people, so that is the first part of the problem. For the middle rows I think I would treat the married couple as a single seat - that is another sub-problem. The last row could be counted by hand. Then merge them together.

This is where I would start, and I might be wrong. I would go back, check and repeat - not very helpful advice for the test, I know.

A few points:

- look out for problems that ask you to count 'at least one, two or three' of something; substitute 'all possibilities minus the exact counts below the threshold' - it is usually easier to calculate the exact cases.

- if a problem requires you to count things that are next to each other, it is usually good to merge those things into one item, multiply by the number of arrangements within the merged item, and see if the problem gets simpler.

- there are usually multiple ways to solve a problem

Simplify, practice, repeat.
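That "merge adjacent things" trick can be sanity-checked by brute force on a made-up example (not the plane problem itself): counting seatings of n people in a row where one fixed couple must sit together.

```python
from itertools import permutations
from math import factorial

def count_adjacent(n):
    """Count row seatings of people 0..n-1 where 0 and 1 sit next to each other."""
    return sum(
        1
        for seating in permutations(range(n))
        if abs(seating.index(0) - seating.index(1)) == 1
    )

# The merging trick predicts 2 * (n-1)!: treat the couple as one merged
# seat ((n-1)! arrangements of the remaining items), then multiply by 2
# for the couple's internal order.
for n in range(2, 7):
    assert count_adjacent(n) == 2 * factorial(n - 1)

print(count_adjacent(5))  # 48 = 2 * 4!
```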

I solve one or two combinatorics problems a day and I am far, far away from being an expert. At the moment I find the plane problem hard and definitely worth more than 4 points.

EDIT: I watched the video and I misread the question, sorry. I thought there was only one couple and that they were allowed to sit across the aisle. So my description of how to solve this particular problem isn't correct, but the process of slicing the problem into smaller problems still stands. Just to add two more points:

- read the question carefully ;)

- draw a simplified diagram to get more insight.

Original Poster · 1 point · 7 days ago

I guess there isn't a good shortcut. The frustrating thing about these questions is that I feel like I haven't learned anything from a question. In other topics it takes like 3 questions to get the gist of the technique.

1 point · 6 days ago · edited 6 days ago

I know it can get frustrating, because I felt the same at the beginning - I never arrived at the right answer. But trust me, it gets better with practice. Combinatorics problems are like puzzles. There is no shortcut, but with enough practice you will start spotting the patterns. Most test problems are designed to look obscure until you spot something, and then they become easy. Take your time. Do your examples, analyse the answers, and make sure you understand what the "key point" was that solved each one. Repeat. In a very short time you will start noticing that those "key points" show up over and over again in different problems. As I said before, most of the time it will be: split the problem into smaller cases, then use permutations/combinations, then multiply/add them together. Good luck!

2 points · 11 days ago

What is it bouncing off? I've been watching it for 5 minutes now. Is it the dark matter everyone is talking about?

3 points · 12 days ago

You can watch the OCW Linear Algebra course, which is based on Introduction to Linear Algebra. I've got both books. I bought Linear Algebra Done Right first, but after watching some of the OCW Linear Algebra lectures I switched to Introduction to Linear Algebra. It is very detailed, with tons of examples; if you have time and are planning to really learn linear algebra, I would spend more time with Professor Strang.

1 point · 22 days ago · edited 22 days ago

We have (A - λI)x = 0

Let's say B = A - λI

So B x = 0

Suppose B were invertible. Multiplying both sides by B⁻¹ gives

B⁻¹Bx = B⁻¹0

Since B⁻¹B = I, we get

x = 0

This doesn't make any sense, because we know that x is an eigenvector of A and hence nonzero. So B is not invertible after all, and its determinant must be 0.

B = A - λI

You know that det B must be zero, so for a 2×2 matrix A with entries a, b, c, d:

(a - λ)(d - λ) - bc = λ² - (a + d)λ + ad - bc = 0

We have a + d = tr A and ad - bc = det A.

Hence your eigenvalues are the solutions of the quadratic equation

λ² - (tr A)λ + det A = 0
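A quick numeric sanity check of that quadratic (the example matrix is made up for illustration):

```python
import numpy as np

# Example 2x2 matrix (made up): a=4, b=2, c=1, d=3.
A = np.array([[4.0, 2.0],
              [1.0, 3.0]])

tr, det = np.trace(A), np.linalg.det(A)

# Solve lambda^2 - (tr A) lambda + det A = 0 with the quadratic formula.
disc = np.sqrt(tr**2 - 4 * det)
roots = sorted([(tr - disc) / 2, (tr + disc) / 2])

# Compare against numpy's eigenvalue routine.
assert np.allclose(roots, sorted(np.linalg.eigvals(A).real))
print(roots)  # approximately [2.0, 5.0]
```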

Just to add a bit to the discussion - maybe not answering your question directly, but Linear Algebra is the most used branch of mathematics.

Most other branches use Linear Algebra in one form or another.

A very informal way of explaining why this is:

1) It is fairly easy, with some simplifications and assumptions, to convert almost any problem into some sort of Linear Algebra problem.

2) Once you succeed in transferring your problem into a Linear Algebra problem, you gain over 200 years of mathematical theorems for free. Things like dot products and eigenvalues/eigenvectors can tell you very interesting things about your system.

3) Computers are stupidly fast at performing matrix operations. Plus, solving a Linear Algebra problem usually involves fewer calculations, which means fewer rounding errors. So convert your problem into a Linear Algebra problem and you will get your results faster and more accurately.

As I said, my description is very abstract (haha) and incomplete, so please be gentle. I remember when I first discovered how widely it is used - I was mind-blown.

15 points · 25 days ago

This is mostly interesting from a technical point of view; not much actually happens. So if we look, for example, at all functions f: R -> R, these can form a real vector space using the definition you mentioned. To speak of a vector space, we need to define scalar multiplication and addition on the elements of said space - in this case, functions. That means for addition we need a well-defined function

h = f + g

whenever f and g are functions. And here the definition comes into play: the new function is defined as

h(x) := (f+g)(x) := f(x) + g(x)

So we use the addition defined on the real numbers to define an addition on functions in our new function space.
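That definition translates almost verbatim into code; a minimal sketch of pointwise addition (plus scalar multiplication, the other vector space operation):

```python
# (f+g)(x) := f(x) + g(x), and (c*f)(x) := c * f(x), pointwise.
def add(f, g):
    return lambda x: f(x) + g(x)

def scale(c, f):
    return lambda x: c * f(x)

f = lambda x: x**2
g = lambda x: 3*x

h = add(f, g)               # h(x) = x^2 + 3x
assert h(2) == 10           # 4 + 6
assert scale(5, f)(2) == 20  # 5 * 4
```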

7 points · 25 days ago

30 minutes ago I had a revelation - I finally understand what a vector space really is and what a vector actually is in this context. A random sequence of thoughts that happened while I was working, out of nowhere. An hour ago I wouldn't have understood what you mean; now I fully understand it. I love Math.

There is no theory of evolution. Just a list of animals Chuck Norris allows to live.

2 points · 28 days ago

Very nice. I never thought of making rivers! That should be added to the game - or has it been? I haven't played for a while!

Original Poster · 1 point · 28 days ago

I wanted to supply my base with water, so I created a river with the CanalBuilder mod! It lets me travel across shallow water. I will expand the river-style base like Venice, haha.

3 points · 28 days ago

Nice. I will add CanalBuilder to the list of mods for my next playthrough. Thank you.

Python is the go-to language for ML not because of the language itself but because of all the libraries that support it. A lot of that stuff is implemented in C, so most of the time you are as fast as C; other libraries are not implemented in C, so they are slower.

Do you want to implement all of the ML algorithms from scratch so you can use them yourself? Good luck. Let me know when you've implemented SVM, Random Forest, boosting, bagging and the rest - I am looking forward to taking a look.

You want to use Python because of all the libraries and pipelines it provides: sklearn, pandas, numpy, tensorflow, matplotlib, just to name a few. So I would plan a little more than a week or two to get up to speed...

Starcraft 1 is free. Check it out.

2 points · 1 month ago

Starcraft 2 is free as well.


I have found CS188.1x at edX. It has a lot of positive reviews. Although it ended a while ago, I am sure it is still relevant.

Are there any other worthwhile alternatives that are currently active?

Please note that I am not necessarily looking for yet another ML/NN course focusing on data science. I am looking for an AI-related one - agents, bots, searches... Preferably with a hands-on approach.


There are some courses listed (spread out a bit) in the Getting Started section of our wiki. Udacity has a bunch of intro to AI courses and EdX has a new(er) one from Columbia. I haven't taken that one, but the Berkeley one was great. I don't know how much worse it would be to just take the "finished" course now (I guess there won't be much discussion on the forum anymore, but other than that...). If you're looking for something a bit different, you may also like MIT's AGI course.

Original Poster · 1 point · 1 month ago

Do you think there is not that much of a difference between introductory courses?

The main reason for me to find an active course is the engagement. Deadlines and forum chats usually keep me more focused.

I am going to check your links. Thank you.

> Do you think there is not that much of a difference between introductory courses?

There are differences, but I don't really know about all of them. The main comparison is probably between the ColumbiaX course and the Udacity courses, because MIT's AGI course is very different (not really a modern MOOC; just a bunch of videos from different guest lecturers; but it gives an intro in something a lot of people are quite interested in again).

I have taken some of the Udacity courses when they first started, which is more than 5 years ago now. What I like(d) about them was that they have multiple AI courses where the first intro course is probably the easiest and you can then take the others which are a bit more difficult. I don't know exactly what Udacity and these courses are like now, but I'm a bit like you and when they switched to the "self-paced" model I basically stopped doing their courses. I have no idea how busy the forums are currently.

I haven't taken ColumbiaX's course, so I can't say much about it. It seems to have replaced BerkeleyX's course which you linked, so perhaps it's similar? I liked that one better than Udacity's AI courses, but I don't know how representative my experience is since I took them for fun after I already had bachelor's and master's degrees in AI. (And also, I don't really know if ColumbiaX's course is similar.) What I do know is that it doesn't seem to be self-paced, so that's good, but the current iteration started last month, so that's too bad.

Original Poster · 1 point · 1 month ago

That is a lot of info. Cheers. I guess I will wait for the next run of ColumbiaX. I just got Learning with Kernels; it should keep me occupied for a couple of months.


2 points · 1 month ago

My suggestion for the article would be to bring it full circle. Show some inputs and classes. You kind of geeked out on the CV, but didn't say what the grid search returned, or show that applied to the original problem.

Original Poster · 1 point · 1 month ago

Thank you for taking the time to read it! I will see what I can do. The feedback I've got so far is that people aren't sure who the target audience is - and I agree. I will think more about it in the future.

2 points · 1 month ago

Plastic surgeries are getting out of hand ...

Eve Online, and then do planetary production or manufacturing.

Original Poster · 1 point · 1 month ago

That would be perfect. I did exactly this for a long time, a while ago. I just don't want to go back to Eve atm. Good suggestion though.

f is a function of x defined as f(x) = 5x² - 4

so f maps any x to a new value described by the rule 5x² - 4

so if you need to find the value for a specific x - in your case -6 - you can write it as f(-6)

and to find the value of -6 under f, substitute -6 into the polynomial:

so f(-6) = 5·(-6)² - 4 = 176
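The same substitution, checked in code:

```python
# f(x) = 5x^2 - 4, evaluated at x = -6.
def f(x):
    return 5 * x**2 - 4

assert f(-6) == 176  # 5 * 36 - 4
```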

7 points · 1 month ago

There are no eyes or any other features.

There is only a multidimensional vector space - 100, 1000, 1M dimensions... and keep in mind we live in a 3D world.

There is a mathematical model - simply a mathematical function - that tries to fit a line, plane or hyperplane to that multidimensional space in order to, for example, optimally separate A from B.

No magic. Some rules about what is A and what is B. Pure math.

You can build a specialised function and call it an eye detector, but at the bottom of it there will be a simple function fitting hyperplanes and manipulating matrices. An eye is an abstract concept.

Neural networks are often referred to as black boxes - you don't know why one image was classified as A or B, i.e. you don't know whether the model found an eye, a specific colour or something totally different in order to classify it as A.

I am not an expert but this is how I see this.

Original Poster · 1 point · 1 month ago

Yes, I'm with you so far, and everything you've said matches what I understand from the various tutorials I've read.

Forgive me for struggling to articulate the question. Where I get lost is that the number of possible mathematical functions that could be applied to sets of images is beyond ridiculously large, and only a small subset would have any relevance to the sorting problem at hand. I'm confused as to how ML programs can winnow down that pool of possible functions to a manageable number to test via training without having some human direction on what to look for, what features exist, etc.

3 points · 1 month ago

One thing that might be helpful is to understand that you are not trying to winnow down that pool of possible functions; you are trying to find the best parameters for a single function. And that function can have 1M parameters - the number of your features. And there will be one best set of parameters that separates A from B.

For that there are directions - rules for what counts as best. Most of those models, aka functions, are designed so that they try to minimise something.

For example, some minimise the root mean square error; others, like hinge loss, try to find a maximum margin between the classes. These are commonly known as loss functions.
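A toy numeric sketch of both losses (the labels and scores are made-up numbers):

```python
import numpy as np

y_true = np.array([1.0, -1.0, 1.0])   # labels in {-1, +1}
y_pred = np.array([0.8, -0.5, -0.2])  # made-up model scores

# Root mean square error: penalises squared distance from the target.
rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))

# Hinge loss: zero once a point is correctly beyond the margin.
hinge = np.mean(np.maximum(0.0, 1.0 - y_true * y_pred))

print(round(float(rmse), 4), round(float(hinge), 4))  # 0.7594 0.6333
```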

a video that helped me to understand squared error

Don't worry about having trouble articulating it - I have it all the time, especially when I am missing that one thing. I hope I understood you correctly and am on the right track explaining it.

Walking a lot. I used to walk first to work and then from work - not the whole journey, of course, but some bits of it. Then I started to walk every other evening. I lost 12 kilos in 6 months - maybe nothing spectacular, but I keep going.

1 point · 1 month ago · edited 1 month ago

Not physics. Machine Learning relies heavily on those Germans, as a way to reduce the dimensionality of your data. They are key to the SVD and PCA methods.

1 point · 1 month ago · edited 1 month ago

I have set up a simple experiment to check the importance of a multi-core CPU when running sklearn's GridSearchCV with KNeighborsClassifier. The results I got surprised me, and I wonder if I misunderstood the benefits of multiple cores or maybe haven't done it right.

First of all, I loaded MNIST and used a 0.05 test size to get 3000 digits into X_play.

from sklearn.datasets import fetch_mldata
from sklearn.model_selection import train_test_split

mnist = fetch_mldata('MNIST original')

X, y = mnist["data"], mnist['target']

X_train, X_test, y_train, y_test = X[:60000], X[60000:], y[:60000], y[60000:]
_, X_play, _, y_play = train_test_split(X_train, y_train, test_size=0.05, random_state=42, stratify=y_train, shuffle=True)

In the next cell I set up KNN and a GridSearchCV:

from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import GridSearchCV

knn_clf = KNeighborsClassifier()
param_grid = [{'weights': ["uniform", "distance"], 'n_neighbors': [3, 4, 5]}]

Then I set up 8 cells, one for each number of jobs (N_JOB_1_TO_8 below stands for the values 1-8). My CPU is an i7-4770 with 4 cores / 8 threads.

grid_search = GridSearchCV(knn_clf, param_grid, cv=3, verbose=3, n_jobs=N_JOB_1_TO_8)
, y_play)

n_jobs | Wall time | Verbose elapsed
------ | --------- | ---------------
1      | 1min 57s  | 2.0min
2      | 1min 25s  | 1.4min
3      | 1min 20s  | 1.3min
4      | 1min 20s  | 1.3min
5      | 1min 25s  | 1.4min
6      | 1min 25s  | 1.4min
7      | 1min 25s  | 1.4min
8      | 1min 24s  | 1.4min

(each verbose line reads: [Parallel(n_jobs=N)]: Done 18 out of 18 | elapsed: ... finished)

So there is no difference in speed between 2 and 8 jobs. How come? I did notice a difference on the CPU performance tab: while the first cell was running, CPU usage was ~13%, and it gradually increased to 100% for the last cell. I was expecting it to finish faster - maybe not linearly faster (i.e. 8 cores being 2 times faster than 4 cores), but at least a bit faster.

I am running it atm with test_size=0.2, which is 12000 digits, but this will take a while. I will try a different classifier as well, but I wanted to check with you first. There must be something happening here that I am not aware of.
