Hmm... Nothing off the top of my head right now. I checked out the Wikipedia page for Deep Learning and it's not bad, but it's quite heavy on technical info and jumps around the timeline, though it does go all the way back to the 1920s with its history, which makes for good jumping-off points. Most of what I know came from grad school and researching creative AI around 2015-2019, plus being a bit obsessed with it before and during my undergrad.
If I were to pitch some key notes: the page details lots of the cool networks that dominated from the '60s through the 2000s, but it's worth noting there were plenty of competing models besides neural nets at the time. Then around 2011, two things happened at about the same time. First, the ReLU activation (a simple way to keep signals and gradients from shrinking away as they pass through many layers, which lets networks get deeper) had been around since the '60s but only swept deep learning in 2011. Second, and even more importantly, Nvidia's cheap graphics cards with parallel processing and CUDA turned out to massively speed up training and running networks.
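If it helps, here's a quick toy sketch (my own illustration, not from any of those links) of why ReLU made deep stacks trainable: a sigmoid-style activation squashes the gradient a little at every layer, while ReLU passes it through unchanged wherever the unit is active.

```python
import numpy as np

# ReLU: max(0, x). Its slope is exactly 1 for any positive input, so
# gradients don't shrink toward zero as they flow back through many
# layers (the "vanishing gradient" problem that plagued deep sigmoid nets).
def relu(x):
    return np.maximum(0.0, x)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Toy illustration: push a gradient of 1 back through 20 layers.
# Sigmoid's slope is at most 0.25, so even in the best case the gradient
# shrinks every layer; ReLU's slope is 1 on the active side.
grad_sigmoid, grad_relu = 1.0, 1.0
for _ in range(20):
    grad_sigmoid *= 0.25  # best-case sigmoid slope per layer
    grad_relu *= 1.0      # ReLU slope on the active side

print(grad_sigmoid)  # ~9e-13 -- effectively gone
print(grad_relu)     # 1.0
```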
I found a few links with some cool perspectives: Nvidia post with some technical details
Solid and simplified timeline with lots of great details
It does exclude a few of the big popular-culture events, like Watson on Jeopardy in 2011. To me that one's fascinating because Watson's architecture was an absolute mess by today's standards: over 100 different algorithms working in conjunction, mixing tons of techniques together to get a pretty specifically tuned question-and-answer machine. It took 2,880 CPU cores to run, and it could win about 70% of the time at Jeopardy. Compare that to today's GPT-style models: while ChatGPT itself takes massive amounts of processing power to run, the underlying structure is otherwise elegant, and I can run awfully competent ones on a $400 graphics card. I was actually in a gap year waiting to start my undergrad in AI and robotics during the Watson craze, so seeing that and then the 2012 big bang was wild.