[Kaggle Study Curriculum]
AI and data analysis are probably the fields drawing the most attention right now.
These days many companies are either creating new data analysis roles or demanding far more expertise for those roles than they used to.
While job hunting in the first half of 2021, I saw that most data roles had added a coding test to the hiring process.
One of the tests I passed was KB Kookmin Bank's, which had two general coding problems and two SQL problems.
It was clear that coding ability is now expected for these roles far more than before.
For the second-half hiring season, I plan to prepare for both general coding tests and data analysis coding tests.
The data analysis study group I am currently part of follows the curriculum below.
So, let's get started and work through it.
Study Curriculum
Binary classification : Tabular data
1st level. Titanic: Machine Learning from Disaster
- Titanic Tutorial 1 - Exploratory data analysis, visualization, machine learning
- EDA To Prediction(DieTanic)
- Titanic Top 4% with ensemble modeling
- Introduction to Ensembling/Stacking in Python
2nd level. Porto Seguro's Safe Driver Prediction
- Data Preparation & Exploration
- Interactive Porto Insights - A Plot.ly Tutorial
- XGBoost CV (LB .284)
- Porto Seguro Exploratory Analysis and Prediction
3rd level. Home Credit Default Risk
- Introduction: Home Credit Default Risk Competition
- Introduction to Manual Feature Engineering
- Stacking Test-Sklearn, XGBoost, CatBoost, LightGBM
- LightGBM 7th place solution
Multi-class classification : Tabular data
1st level. Costa Rican Household Poverty Level Prediction
Binary classification : Image classification
1st level. Statoil/C-CORE Iceberg Classifier Challenge
- Keras Model for Beginners (0.210 on LB)+EDA+R&D
- Transfer Learning with VGG-16 CNN+AUG LB 0.1712
- Submarineering.EVEN BETTER PUBLIC SCORE until now.
- Keras+TF LB 0.18
Multi-class classification : Image classification
1st level. TensorFlow Speech Recognition Challenge
- Speech representation and data exploration
- Light-Weight CNN LB 0.74
- WavCeption V1: a 1-D Inception approach (LB 0.76)
Regression : Tabular data
1st level. New York City Taxi Trip Duration
2nd level. Zillow Prize: Zillow's Home Value Prediction (Zestimate)
- Simple Exploration Notebook - Zillow Prize
- Simple XGBoost Starter (~0.0655)
- Zillow EDA On Missing Values & Multicollinearity
- XGBoost, LightGBM, and OLS and NN
Object segmentation : Deep learning
1st level. 2018 Data Science Bowl
- Teaching notebook for total imaging newbies
- Keras U-Net starter - LB 0.277
- Nuclei Overview to Submission
Natural language processing : classification, regression
1st level. Spooky Author Identification
- Spooky NLP and Topic Modelling tutorial
- Approaching (Almost) Any NLP Problem on Kaggle
- Simple Feature Engg Notebook - Spooky Author
2nd level. Mercari Price Suggestion Challenge
- Mercari Interactive EDA + Topic Modelling
- A simple nn solution with Keras (~0.48611 PL)
- Ridge (LB 0.41943)
- LGB and FM [18th Place - 0.40604]
3rd level. Toxic Comment Classification Challenge
- [For Beginners] Tackling Toxic Using Keras
- Stop the S@#$ - Toxic Comments EDA
- Logistic regression with words and char n-grams
- Classifying multi-label comments (0.9741 lb)
Other dataset : anomaly detection, visualization
1st level. Credit Card Fraud Detection
- In depth skewed data classif. (93% recall acc now)
- Anomaly Detection - Credit Card Fraud Analysis
- Semi-Supervised Anomaly Detection Survey
2nd level. Kaggle Machine Learning & Data Science Survey 2017
컀리νλΌ μΆμ²: https://kaggle-kr.tistory.com/32?fbclid=IwAR1k1ZZhbep5L5yngqRpglRkrap49K-2tCRYoqP1Yx-AlorfIMOOLxcIgfQ