Basic utilities with few dependencies.
%load_ext autoreload
%autoreload 2
%matplotlib inline

At training time, we typically want to put the model and the current mini-batch on the GPU. When developing on a machine without a GPU, though, that isn't possible, so we define a variable that automatically finds the right device. This lives in utils rather than core to avoid circular imports with the callbacks module.

DEVICE
device(type='cpu')
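For reference, a minimal sketch of how such a device variable can be defined (assuming only that torch is installed; the library's actual definition may differ):

```python
import torch

# Prefer the GPU when one is available; otherwise fall back to the CPU.
DEVICE = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
```

Models and tensors can then be moved with `.to(DEVICE)` regardless of the machine we're on.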

reproducible[source]

reproducible(seed=1, verbose=True)

gpu_setup[source]

gpu_setup(make_reproducible=True, seed=1, verbose=1)

hasarg[source]

hasarg(func, arg)

Check whether a function accepts a given argument. Works with *args and
**kwargs as well if you exclude the stars (e.g. pass 'args', not
'*args'). See example below.

Parameters
----------
func: function
arg: str
    Name of argument to look for.

Returns
-------
bool

Example
-------
def foo(a, b=6, *args):
    return

>>> hasarg(foo, 'b')
True

>>> hasarg(foo, 'args')
True

>>> hasarg(foo, 'c')
False
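A check like this can be sketched with the standard library's inspect module (a sketch only; the actual implementation in this library may differ):

```python
import inspect

def hasarg(func, arg):
    # An argument "exists" if its name appears in the signature's
    # parameters, which include *args and **kwargs (without the stars).
    return arg in inspect.signature(func).parameters

def foo(a, b=6, *args):
    return
```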

quick_stats[source]

quick_stats(x, digits=3)

Quick wrapper to get mean and standard deviation of a tensor.

Parameters
----------
x: torch.Tensor
digits: int
    Number of digits to round mean and standard deviation to.

Returns
-------
tuple[float]
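A sketch of what this wrapper likely reduces to (assumes torch, and uses torch's default sample standard deviation):

```python
import torch

def quick_stats(x, digits=3):
    # Round the tensor's mean and standard deviation to `digits` places.
    return round(x.mean().item(), digits), round(x.std().item(), digits)
```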

concat[source]

concat(*args, dim=-1)

Wrapper to torch.cat which accepts tensors as non-keyword
arguments rather than requiring them to be wrapped in a list.
This can be useful if we've built some generalized functionality
where parameters must be passed in a consistent manner.

Parameters
----------
args: torch.tensor
    Multiple tensors to concatenate.
dim: int
    Dimension to concatenate on (last dimension by default).

Returns
-------
torch.tensor
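In essence this is a thin wrapper around torch.cat; a sketch:

```python
import torch

def concat(*args, dim=-1):
    # Forward the unpacked tensors to torch.cat as a single tuple.
    return torch.cat(args, dim=dim)
```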

weighted_avg[source]

weighted_avg(*args, weights)

Compute a weighted average of multiple tensors.

Parameters
----------
args: torch.tensor
    Multiple tensors with the same dtype and shape that you want to
    average.
weights: list
    Ints or floats to weight each input tensor. The length of this list
    must match the number of tensors passed in: the first weight will be
    multiplied by the first tensor, the second weight by the second
    tensor, etc. If your weights don't sum to 1, they will be normalized
    automatically.

Returns
-------
torch.tensor: Same dtype and shape as each of the input tensors.
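The normalization described above can be sketched as follows (an assumption about the implementation: weights are simply rescaled to sum to 1):

```python
import torch

def weighted_avg(*args, weights):
    # Normalize the weights so they sum to 1, then sum the scaled tensors.
    total = sum(weights)
    return sum(t * (w / total) for t, w in zip(args, weights))
```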

identity[source]

identity(x)

Temporarily copied from htools.

Return the input argument unchanged. This is convenient when a function
is only sometimes applied to an item: rather than defaulting a variable
to None, sometimes setting it to a function, and checking for None every
time we're about to call it, we can default it to identity and call it
unconditionally.

Parameters
----------
x: any

Returns
-------
x: Unchanged input.
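The pattern the docstring describes looks like this in practice (`process` is a hypothetical helper used only for illustration):

```python
def identity(x):
    return x

# Hypothetical helper: defaulting the transform to identity means the
# call site never needs to check for None before calling it.
def process(items, transform=identity):
    return [transform(item) for item in items]
```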

tensor_dict_diffs[source]

tensor_dict_diffs(d1, d2)

Compare two dictionaries of tensors. The two dicts must have the
same keys.

Parameters
----------
d1: dict[any: torch.Tensor]
d2: dict[any: torch.Tensor]

Returns
-------
list: Keys whose corresponding tensors differ between d1 and d2.
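A sketch of how such a comparison might work (assumes torch, using torch.equal for exact element-wise comparison):

```python
import torch

def tensor_dict_diffs(d1, d2):
    # Both dicts must share the same keys; return those whose tensors
    # are not exactly equal.
    return [k for k in d1 if not torch.equal(d1[k], d2[k])]
```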

find_tensors[source]

find_tensors(gpu_only=True)

Prints a list of the Tensors being tracked by the garbage collector.
From
https://forums.fast.ai/t/gpu-memory-not-being-freed-after-training-is-over/10265/8
with some minor reformatting.

Parameters
----------
gpu_only: bool
    If True, only find tensors that are on the GPU.

Returns
-------
None: Output is printed to stdout.

inverse_sigmoid[source]

inverse_sigmoid(y)

Used to determine the bias initializer for the final linear layer of a
model.

Parameters
----------
y: float
    Value between 0 and 1 (e.g. the proportion of the training data that
    is positive).

Returns
-------
float: Inverse sigmoid of input,
    i.e. if y=sigmoid(x), then inverse_sigmoid(y)=x.
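Mathematically this is just the logit function; a sketch:

```python
import math

def inverse_sigmoid(y):
    # logit(y) = log(y / (1 - y)), the algebraic inverse of sigmoid.
    return math.log(y / (1 - y))
```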

init_bias_constant_[source]

init_bias_constant_(layer, value=None, target_pct=None)

Helper to initialize a layer's bias term to a constant. This is
particularly useful for the final layer of a binary classifier, where it's
often helpful to initialize it to the value that, when passed through a
sigmoid activation, equals the percent of your dataset belonging to
the majority class. This reduces the chance that the first epoch or so
will be spent simply learning a bias term, which has two benefits:

1. May reduce training time slightly.
2. The beginning of training can be deceptively important - a messy first
couple of epochs can have long-lasting repercussions. This is often hard
to identify without in-depth digging into model weights, so it often goes
unnoticed.

Parameters
----------
layer: nn.Module
    The layer to initialize a bias for. This will often be the last layer
    of our network - we rarely need to initialize a constant bias
    otherwise.
value: float or None
    If provided, the bias will be initialized to this value.
target_pct: float or None
    If provided, must be a float between 0 and 1.

Returns
-------
None: The layer is updated in place.
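Putting the pieces together, a sketch of the initializer (an assumption: target_pct is converted to a bias value via the inverse sigmoid, as the description suggests):

```python
import math
import torch.nn as nn

def init_bias_constant_(layer, value=None, target_pct=None):
    # If target_pct is given, pick the value whose sigmoid equals it
    # (the inverse sigmoid / logit), then fill the bias in place.
    if value is None:
        value = math.log(target_pct / (1 - target_pct))
    nn.init.constant_(layer.bias, value)
```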

is_builtin[source]

is_builtin(x, drop_callables=True)

Check if an object belongs to the Python standard library.

Parameters
----------
x: any
    The object to check.
drop_callables: bool
    If True, we won't consider callables (classes/functions) to be builtin.
    Classes have class `type` and functions have class
    `builtin_function_or_method`, both of which are builtins - however,
    this is often not what we mean when we want to know if something is
    built in. Note: knowing the class alone is not enough to determine if
    the objects it creates are built-in; this may depend on the kwargs
    passed to its constructor. This will NOT check if a class was defined
    in the standard library.

Returns
-------
bool: True if the object is built-in. If the object is list-like, each
item is checked as well as the container itself. If the object is
dict-like, each key AND value is checked (you can always pass in d.keys()
or d.values() for more limited checking); again, the container itself is
checked as well.
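The core of the check can be sketched as below (container recursion, described in the Returns section, is omitted for brevity):

```python
def is_builtin(x, drop_callables=True):
    # Optionally rule out classes and functions, then check whether the
    # object's class lives in the builtins module.
    if drop_callables and callable(x):
        return False
    return type(x).__module__ == 'builtins'
```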

out_features[source]

out_features(model)

Try to extract number of output features from the last layer of a
model. This is often useful when building encoder-decoder models or
stacking encoders and classification heads. Not sure how airtight the
logic is here so use with caution.

Parameters
----------
model: nn.Module
    Model to examine.

Returns
-------
int: Number of output features.
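One plausible sketch of this logic (an assumption about the implementation: walk the flattened module tree in reverse and return the first `out_features` attribute found):

```python
import torch.nn as nn

def out_features(model):
    # Search the modules backwards for a layer that exposes an
    # out_features attribute (e.g. a trailing nn.Linear).
    for layer in reversed(list(model.modules())):
        if hasattr(layer, 'out_features'):
            return layer.out_features
    raise ValueError('Could not infer number of output features.')
```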