Data Preprocessing    Created: 2021-04-21
Updated: 2021-04-23


    Progress so far:
      $ source [path to venv]/venv/bin/activate
      (venv) $ python
      >>> import tensorflow as tf
      >>> tf.enable_eager_execution()
      >>> train_file_path = "[path from /]/.keras/datasets/train.csv"
      >>> test_file_path = "[path from /]/.keras/datasets/eval.csv"
      >>> LABEL_COLUMN = 'survived'
      >>> def get_dataset(file_path, **kwargs):
      ...   dataset = tf.data.experimental.make_csv_dataset(
      ...       file_path,
      ...       batch_size=5,
      ...       label_name=LABEL_COLUMN,
      ...       na_value="?",
      ...       num_epochs=1,
      ...       ignore_errors=True,
      ...       **kwargs)
      ...   return dataset
      ...
      >>> raw_train_data = get_dataset(train_file_path)
      >>> raw_test_data = get_dataset(test_file_path)
      >>> import numpy as np
      >>> np.set_printoptions(precision=3, suppress=True)
      >>> def show_batch(dataset):
      ...   for batch, label in dataset.take(1):
      ...     for key, value in batch.items():
      ...       print("{:20s}: {}".format(key, value.numpy()))
      ...


    Because of the different data types and ranges, you can't simply stack the
    features into a NumPy array and pass it to a keras.Sequential model.
    Each column needs to be handled individually.
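    For instance, continuing the session above, printing each column's dtype for
    one batch shows a mix of string and float tensors (a quick check in the same
    style as show_batch; the exact column set depends on the CSV):

      >>> for batch, label in raw_train_data.take(1):
      ...   for key, value in batch.items():
      ...     print("{:20s}: {}".format(key, value.dtype))
      ...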

    As one option, you could preprocess your data offline (using any tool you like) to convert categorical columns to numeric columns, then pass the processed output to your TensorFlow model.
    The disadvantage of that approach is that if you save and export your model, the preprocessing is not saved with it.
    The experimental.preprocessing layers avoid this problem because they're part of the model.
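    A minimal sketch of that idea, assuming a TensorFlow 2.x release that ships
    tf.keras.layers.experimental.preprocessing; the 'age' column name and the
    adapt() values are illustrative only:

      import numpy as np
      import tensorflow as tf

      # Normalization learns mean/variance from data and then lives inside the
      # model, so it is saved and exported together with the rest of the graph.
      normalizer = tf.keras.layers.experimental.preprocessing.Normalization()
      normalizer.adapt(np.array([[22.0], [38.0], [26.0], [35.0]]))  # illustrative values

      inputs = tf.keras.Input(shape=(1,), name='age')
      outputs = normalizer(inputs)
      model = tf.keras.Model(inputs, outputs)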

    In this example, you'll build a model that implements the preprocessing logic using the Keras functional API.
    You could also do it by subclassing.
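    A rough sketch of that subclassing alternative; the class name is
    hypothetical, preprocessing_layer is assumed to be built elsewhere, and the
    layer sizes follow the four-layer plan below:

      import tensorflow as tf

      class TitanicModel(tf.keras.Model):
          def __init__(self, preprocessing_layer):
              super().__init__()
              self.preprocessing_layer = preprocessing_layer
              self.dense_1 = tf.keras.layers.Dense(128, activation='relu')
              self.dense_2 = tf.keras.layers.Dense(128, activation='relu')
              self.out = tf.keras.layers.Dense(1)

          def call(self, inputs):
              # Preprocessing runs as the first step of the forward pass.
              x = self.preprocessing_layer(inputs)
              x = self.dense_1(x)
              x = self.dense_2(x)
              return self.out(x)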

    The functional API operates on "symbolic" tensors.
    Normal "eager" tensors have a value.
    In contrast, these "symbolic" tensors do not.
    Instead, they keep track of which operations are run on them and build a representation of the calculation that you can run later.
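    A quick illustration of the difference (not part of the Titanic pipeline itself):

      import tensorflow as tf

      eager_tensor = tf.constant([1.0, 2.0])       # eager tensor: has a concrete value now
      symbolic_input = tf.keras.Input(shape=(2,))  # symbolic tensor: no value, just a graph node
      doubled = symbolic_input * 2.0               # records the multiply instead of computing it
      calc = tf.keras.Model(inputs=symbolic_input, outputs=doubled)
      result = calc(tf.constant([[1.0, 2.0]]))     # running the model executes the recorded ops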


    Thus, the model will take preprocessing_layer as its input layer.
    Concretely, it will be built from the following four layers (see the sketch after the list):
    1. preprocessing_layer
    2. tf.keras.layers.Dense(128, activation='relu')
    3. tf.keras.layers.Dense(128, activation='relu')
    4. tf.keras.layers.Dense(1)
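    A minimal sketch of that model, assuming preprocessing_layer has already been
    built from the CSV columns; the loss/optimizer choices are illustrative, with
    from_logits=True matching the final Dense(1) that has no activation:

      import tensorflow as tf

      model = tf.keras.Sequential([
          preprocessing_layer,                            # assumed to be defined elsewhere
          tf.keras.layers.Dense(128, activation='relu'),
          tf.keras.layers.Dense(128, activation='relu'),
          tf.keras.layers.Dense(1),
      ])

      model.compile(
          loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
          optimizer='adam',
          metrics=['accuracy'])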