
Basic deep learning with aquaponics, Part 4

PART 4: CREATING A DEEP LEARNING MODEL WITH LSTM

This is a continuation of Basic deep learning with aquaponics, Part 3.

We will create an LSTM model with Pond 4, as it seems to be the most consistent. A recurring problem across the ponds is that the ammonia readings are astronomically high, which is common when using affordable sensors with microcontrollers such as the ESP32. Whether we should invest in better sensors is something we need to keep in mind once we start the data engineering section.

There are a few minor things we have to do before we can use the data from Pond 4. Some of the dates are out of order, and there is a gap of over 4 days in the timestamps. We also have to cap ammonia at a certain level; in the code below we drop rows where it exceeds 10,000.

import pandas as pd

# Load and clean Pond 4: drop missing rows, shorten the column names,
# sort by timestamp, and remove duplicate timestamps.
pond4 = pd.read_csv('IoTPond4.csv')
pond4.dropna(inplace=True)
pond4.rename(columns={'created_at':'date', 'Temperature(C)':'temp', 'Dissolved Oxygen(g/ml)':'DO','Ammonia(g/ml)':'ammonia','Nitrate(g/ml)':'nitrate', 'Fish_Weight(g)':'weight'}, inplace=True)
pond4 = pond4[['date','temp','DO','PH','ammonia','nitrate','weight']]
pond4.date = pd.to_datetime(pond4.date)
pond4.sort_values(by='date', inplace=True)
pond4.drop_duplicates(subset=['date'], inplace=True)
pond4.reset_index(drop=True, inplace=True)

from sklearn.preprocessing import MinMaxScaler

# Cap the ammonia outliers, then scale every numeric column to [0, 1]
# so the parameters can be plotted on the same axes.
data = pond4.copy()
data = data[data.ammonia < 10000]
data.reset_index(drop=True, inplace=True)
num_features = data.iloc[:, 1:]
columns = num_features.columns
scaler = MinMaxScaler()
data_scaled = pd.DataFrame(scaler.fit_transform(num_features), columns=columns)

import matplotlib.pyplot as plt

# Plot every 15,000th row so the figure stays readable.
columns_to_plot = ['DO','PH','ammonia','nitrate','weight']
data_scaled['date'] = data['date']
data_to_plot = data_scaled[data_scaled.index % 15000 == 0]
plt.figure(figsize=(12,5))
for col in columns_to_plot:
    plt.plot(data_to_plot.date, data_to_plot[col], label=col, marker='o')

plt.title('Scaled Parameters Versus Dates')
plt.xticks(rotation=45)
plt.legend()
plt.show()

If you haven’t done so, please read Basic deep learning with aquaponics, Part 3 for a basic understanding of aquaponics. With that background, the plot above should look reasonable; for example, the pH and fish-weight curves behave as expected.

Since the data from Pond 4 is a timeseries, that is, measurements obtained at regular intervals of time, a recurrent neural network (RNN) is the natural choice. The measurements from Pond 4 are not quite at a regular interval: most are 20 seconds apart, but the average gap is over a minute. Among the different RNN architectures, we will use the Long Short-Term Memory (LSTM). For an introduction to RNNs and LSTMs, please refer to Deep Learning with Python by Francois Chollet; our construction follows his.
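A quick way to check the claim about intervals (a minimal sketch, assuming data is the cleaned DataFrame from above):

# Gaps between consecutive readings, in seconds.
gaps = data['date'].diff().dt.total_seconds()
print(gaps.median())  # typical gap: about 20 seconds
print(gaps.mean())    # the few large gaps pull the mean above 60 seconds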

From the above code, data is a copy of Pond 4 with the ammonia filter applied. We partition data into three sets: 50% training, 25% validation, and 25% test. We must partition before scaling to prevent data leakage; we then transform each split and concatenate them back into a single dataset.
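The code below calls a helper named preprocess that is not shown in the post. Here is a minimal sketch of what it might look like, assuming the scaler is fit on the training split only and then reused for the other two splits, so no validation or test statistics leak into training:

from sklearn.preprocessing import MinMaxScaler
import pandas as pd

def preprocess(three_types):
    """Scale each (frame, name) split; fit the scaler on the 'train' split only."""
    scaler = MinMaxScaler()
    scaled = []
    for frame, name in three_types:
        if name == 'train':
            values = scaler.fit_transform(frame)   # learn min/max from training data only
        else:
            values = scaler.transform(frame)       # reuse the training statistics
        scaled.append(pd.DataFrame(values, columns=frame.columns, index=frame.index))
    return scaled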

In the following section, each element of a dataset is a pair (samples, targets), where samples is a 3-dimensional tensor of shape (batch_size, sequence_length, features). We choose every 180th sample, which is approximately one per hour (180 samples × 20 seconds ≈ 1 hour). Each sequence therefore covers 12 hours' worth of data.

from tensorflow import keras
from tensorflow.keras import layers

raw_data = data[['PH','nitrate','ammonia']]
weights = data['weight'].values
L = len(raw_data)
T = int(0.5*L)   # size of the training split
V = int(0.25*L)  # size of the validation split

three_types = [(raw_data[:T],'train'), (raw_data[T:T+V],'validation'), (raw_data[T+V:],'test')]
scaled_datas = preprocess(three_types)
df = pd.concat(scaled_datas)
df = df.values

sampling_rate = 180
sequence_length = 12
batch_size = 32

train_dataset = keras.utils.timeseries_dataset_from_array(
    df,
    targets = weights,
    sampling_rate = sampling_rate,
    sequence_length = sequence_length,
    shuffle = True,
    batch_size = batch_size,
    start_index = 0,
    end_index = T)

val_dataset = keras.utils.timeseries_dataset_from_array(
    df,
    targets = weights,
    sampling_rate = sampling_rate,
    sequence_length = sequence_length,
    shuffle = True,
    batch_size = batch_size,
    start_index = T,
    end_index = T+V)

test_dataset = keras.utils.timeseries_dataset_from_array(
    df,
    targets = weights,
    sampling_rate = sampling_rate,
    sequence_length = sequence_length,
    shuffle = True,
    batch_size = batch_size,
    start_index = T+V)
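To confirm the shapes described earlier, we can inspect one batch (a quick sketch):

for samples, targets in train_dataset.take(1):
    print(samples.shape)   # (32, 12, 3): batch_size, sequence_length, features
    print(targets.shape)   # (32,): one target weight per sequence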

We use two LSTM layers with 32 units each and recurrent_dropout of 50% to prevent overfitting, followed by a 50% dropout layer and a dense layer with one unit for the output. So, yeah, the errors are bad. This is not surprising, as the data is pretty erratic due to sensor errors. Your model can only be as good as your data, so I will probably try to find better sensors in the future.

inputs = keras.Input(shape=(sequence_length, df.shape[-1]))
x=layers.LSTM(32, recurrent_dropout=0.5,return_sequences=True)(inputs)
x=layers.LSTM(32, recurrent_dropout=0.5)(x)
x=layers.Dropout(0.5)(x)
outputs = layers.Dense(1)(x)
model = keras.Model(inputs,outputs)

callbacks = [
    keras.callbacks.ModelCheckpoint("aquaponics", save_best_only=True)
]

model.compile(optimizer = 'adam', loss = 'mse', metrics = ['mae'])
history = model.fit(train_dataset,epochs=50, 
                    validation_data=val_dataset, 
                    callbacks=callbacks)

mae = history.history['mae']
val_mae = history.history['val_mae']
epochs = range(1, len(mae)+1)
plt.figure()
plt.plot(epochs, mae, 'bo', label = 'Training MAE')
plt.plot(epochs, val_mae, 'gs', label = 'Validation MAE')
plt.xlabel('Epoch')
plt.ylabel('MAE')
plt.title('Training and Validation MAE')
plt.legend()
plt.show()

There are multiple reasons why the model is bad: (1) I am not good at creating an optimal model, and (2) there is too much noise in the data. A few things we can do: (1) filter the data to include only rows that are 20 seconds apart (see the sketch below), (2) be less greedy by setting sampling_rate = 30 and sequence_length = 60 (30 × 20 seconds = 10 minutes per step, so 60 steps is 10 hours' worth), and (3) add early stopping to halt training if the validation loss does not improve for 10 epochs.
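The filtering step is not shown in the post; here is a minimal sketch of one way to do it, assuming we keep only rows whose gap to the previous reading is exactly 20 seconds:

# Hedged sketch: keep only readings that arrive 20 seconds after the previous one.
# (The first row, whose gap is NaN, is dropped as well.)
gaps = data['date'].diff().dt.total_seconds()
data = data[gaps == 20.0].reset_index(drop=True)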

inputs = keras.Input(shape=(sequence_length, df.shape[-1]))
x=layers.LSTM(64, recurrent_dropout=0.25)(inputs)
x=layers.Dropout(0.5)(x)
outputs = layers.Dense(1)(x)
model = keras.Model(inputs,outputs)

callbacks = [
    keras.callbacks.ModelCheckpoint("aquaponics", save_best_only=True),
    keras.callbacks.EarlyStopping(monitor="val_loss", mode="min", patience=10)
]

model.compile(optimizer = 'adam', loss = 'mse', metrics = ['mae'])
history = model.fit(train_dataset,epochs=50, 
                    validation_data=val_dataset, 
                    callbacks=callbacks)

Once we have an optimal model, we can evaluate it on the test data. In our case, the MAE is 82.5318. Yikes!

model = keras.models.load_model('aquaponics')
print(f"Test MAE : {model.evaluate(test_dataset)[1]:.2f}")

Here is a closer look at the data. The model has no chance. In the beginning, I thought of using affordable sensors with a microcontroller, but I am having second thoughts.

A natural question that comes up is how to retrain the model once we have a significant amount of new data. For example, let’s say we have a new dataset from the aquaponics system after running it for a whole year. How do we train our model on the new data without degrading the old model? Though you could use some sort of ensemble technique, I think the safest route is to create a new model trained on the combined data. See here for a full explanation.
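A minimal sketch of the combined-data route, assuming the new readings land in a CSV with the same columns (the filename IoTPond4_year2.csv is hypothetical):

# Hypothetical sketch: retrain from scratch on the old and new data together.
old = pd.read_csv('IoTPond4.csv')
new = pd.read_csv('IoTPond4_year2.csv')   # assumed filename for the new year of data
combined = pd.concat([old, new], ignore_index=True)
# ...then repeat the same cleaning, splitting, scaling, and training steps as above.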

In the next post, we will construct an image classifier using a CNN to determine whether the plants we grow require attention. Since I love water spinach (also known as morning glory), we will use the CNN on images of water spinach. Water spinach was banned in Georgia for over 15 years; I am glad it’s finally legal to grow again!
