Types of Data and their References

1. Tabular Data

Machine Learning Whole Easy Understanding

https://amazingagenda.tistory.com/2

 

Spaceship Titanic Code with XGBoost Classifier

https://www.kaggle.com/code/ahmedgaitani/spaceship-titanic-code

 

Regression and Classification

https://amazingagenda.tistory.com/1

 

2. Text Data

English Sentiment Analysis: Support Vector Machine

https://www.kaggle.com/code/bansodesandeep/sentiment-analysis-support-vector-machine

 

Korean Sentiment Classification: Random Forest, LSTM

https://qenthusiast.tistory.com/4 

 

3. Image Data

Flower Classification

https://qenthusiast.tistory.com/5

 

 

4. Tabular, Text, and Image Data

https://amazingagenda.tistory.com/3

 

 

** Random Seed

1. Data Splitting

If you split your data into training and testing sets with a function such as scikit-learn's train_test_split, pass random_state so that the same rows land in each set on every run:

from sklearn.model_selection import train_test_split

# Splitting data into training and testing sets
train, test = train_test_split(cdf, test_size=0.2, random_state=42)
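
To see what the seed buys you, here is a minimal sketch that runs the same split twice and compares the results. The array `data` is a hypothetical stand-in for `cdf`, and `split_once` is a helper name made up for this example.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical toy data standing in for the post's `cdf` (10 rows, 2 columns)
data = np.arange(20).reshape(10, 2)

def split_once(seed):
    # The same seed produces the same shuffle, so the same rows go to each subset
    return train_test_split(data, test_size=0.2, random_state=seed)

train_a, test_a = split_once(42)
train_b, test_b = split_once(42)

# Both runs should contain exactly the same rows in the same order
same_split = np.array_equal(train_a, train_b) and np.array_equal(test_a, test_b)
print(same_split)
```

Without random_state, each call reshuffles the rows, so evaluation scores can move between runs even when nothing else changed.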

2. Model Training

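Many models use randomness internally (a random forest, for example, draws bootstrap samples and random feature subsets), so fix the seed through the model's random_state parameter to get the same fitted model on every run. For example, with scikit-learn models: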

from sklearn.ensemble import RandomForestRegressor

# Initializing a Random Forest model with a random seed
model = RandomForestRegressor(random_state=42)
model.fit(train.drop('target', axis=1), train['target'])
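
A quick way to check this is to train the same model twice with the same seed and compare predictions. The sketch below assumes synthetic data from make_regression rather than the post's `train` frame, and `fit_and_predict` is a hypothetical helper.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

# Synthetic data (an assumption for illustration, not the post's dataset)
X, y = make_regression(n_samples=100, n_features=4, noise=0.1, random_state=0)

def fit_and_predict(seed):
    # A small forest keeps the example fast; the seed fixes the bootstrap
    # samples and feature choices made while growing the trees
    model = RandomForestRegressor(n_estimators=20, random_state=seed)
    model.fit(X, y)
    return model.predict(X[:5])

preds_a = fit_and_predict(42)
preds_b = fit_and_predict(42)

# With the same data and the same seed, the two forests should agree
print(np.allclose(preds_a, preds_b))
```

Changing the seed will usually shift the predictions slightly, which is a useful reminder that a single score reflects one particular random draw.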

 

3. Setting Random Seed for Libraries

Libraries that keep their own random number generator have to be seeded separately. Each of the following provides a function that sets the seed globally:

NumPy

import numpy as np

np.random.seed(42)
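
As a small sketch of what this does, reseeding the global generator replays the same numbers. NumPy also offers np.random.default_rng, which returns a local generator object; it is shown here as an alternative and is not part of the original snippet.

```python
import numpy as np

# Global seeding: resetting the seed replays the same sequence
np.random.seed(42)
first = np.random.rand(3)
np.random.seed(42)
second = np.random.rand(3)
print(np.array_equal(first, second))

# Local alternative: a Generator carries its own state, so seeding it
# does not affect other code that relies on the global generator
rng_a = np.random.default_rng(42)
rng_b = np.random.default_rng(42)
draws_a = rng_a.random(3)
draws_b = rng_b.random(3)
print(np.array_equal(draws_a, draws_b))
```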

 

TensorFlow

import tensorflow as tf

tf.random.set_seed(42)

 

PyTorch

import torch

torch.manual_seed(42)
if torch.cuda.is_available():
    # manual_seed_all seeds every GPU; manual_seed only seeds the current one
    torch.cuda.manual_seed_all(42)
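
The per-library calls above can be gathered into one helper that runs at the top of a script. This is only a sketch: `seed_everything` is a hypothetical name, it covers Python's random module, NumPy, and PyTorch when it is installed, and TensorFlow would need its own tf.random.set_seed call added in the same way.

```python
import random

import numpy as np

def seed_everything(seed=42):
    """Hypothetical helper: seed the generators a typical script relies on."""
    random.seed(seed)      # Python's built-in random module
    np.random.seed(seed)   # NumPy's global generator
    try:
        import torch       # optional: skipped when PyTorch is not installed
        torch.manual_seed(seed)
        if torch.cuda.is_available():
            torch.cuda.manual_seed_all(seed)
    except ImportError:
        pass

# Calling the helper twice with the same seed replays the same draws
seed_everything(42)
run_a = (random.random(), np.random.rand())
seed_everything(42)
run_b = (random.random(), np.random.rand())
print(run_a == run_b)
```

Seeding makes the random draws repeatable, but some GPU operations can still be non-deterministic, so results on a GPU may not match to the last digit.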