Log

Fri 18 Oct

Editing Python in Visual Studio Code | code.visualstudio.com/docs

"python.analysis.extraPaths": [
    "~/.local/lib/Google/google_appengine",
    "~/.local/lib/Google/google_appengine/lib/flask-0.12"
]
  • Copilot ad

importResolveFailure | code.visualstudio.com/docs

This error happens when Pylance is unable to find the package or module you're importing, nor its type stubs.

How to fix it

  • If you are importing a module, make sure it exists in your workspace or in a location that is included in the python.autoComplete.extraPaths setting.
  • If you are importing a package that is not installed, you can install it by running the following command in an activated terminal: python -m pip install {package_name}.
  • If you are importing a package that is already installed in a different interpreter or kernel, select the correct interpreter.
  • If you are working with an editable install and it is currently set up to use import hooks, consider switching to using .pth files that only contain file paths instead, to enhance compatibility and ensure smoother import behavior. Learn more in the Pyright documentation.

I think installing Pylance solved all my problems.

Sun 20 Oct

.pyi files are stubs; their purpose and format are described in PEP 484. The interpreter does not use these files at all: they exist only to provide typing information about code. For example, say you have an untyped third-party module that is not yours, so you cannot edit it, but you would still like to give it types:

# fizz.py
def greet(who):
    return f'Hello {who}'

The way out is a stub: you create a file fizz.pyi that contains the typed signature of greet without the implementation:

# fizz.pyi
def greet(who: str) -> str:
    ...

Now static type checkers such as mypy, or code completion in PyCharm, know where to find the types for greet.
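To check that the interpreter really ignores the stub, here is a minimal sketch. It writes the fizz.py and fizz.pyi files from the example into a temporary directory and imports the module. The temporary directory and the importlib scaffolding are my own assumptions, not part of the quoted answer.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # The untyped module and its stub, with the same contents as the example.
    Path(tmp, "fizz.py").write_text("def greet(who):\n    return f'Hello {who}'\n")
    Path(tmp, "fizz.pyi").write_text("def greet(who: str) -> str:\n    ...\n")

    sys.path.insert(0, tmp)
    try:
        fizz = importlib.import_module("fizz")
    finally:
        sys.path.remove(tmp)

# The runtime function carries no annotations: the .pyi file was never read.
print(fizz.greet.__annotations__)  # {}
# A call that violates the stub still runs; only a type checker would complain.
print(fizz.greet(42))              # Hello 42
```

So the stub changes nothing at runtime; it only matters to tools that read it.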

Something about Torch

>>> torch.ones(8)
tensor([1., 1., 1., 1., 1., 1., 1., 1.])
>>> torch.bernoulli(torch.ones(8) * 0.5)
tensor([1., 0., 1., 1., 1., 0., 1., 1.])
>>> torch.bernoulli(torch.ones(8) * 0.5)
tensor([1., 0., 1., 1., 0., 1., 1., 0.])
>>> torch.bernoulli(torch.ones(8) * 0.1)
tensor([0., 0., 0., 0., 0., 0., 0., 0.])
>>> torch.bernoulli(torch.ones(8) * 0.1)
tensor([0., 1., 0., 1., 0., 0., 0., 0.])

Tensor.expand | pytorch.org

Tensor.expand(*sizes) → Tensor

Returns a new view of the self tensor with singleton dimensions expanded to a larger size.

>>> x = torch.tensor([[1], [2], [3]])
>>> x.size()
torch.Size([3, 1])
>>> x.expand(3, 4)
tensor([[ 1,  1,  1,  1],
        [ 2,  2,  2,  2],
        [ 3,  3,  3,  3]])
>>> x.expand(-1, 4)   # -1 means not changing the size of that dimension
tensor([[ 1,  1,  1,  1],
        [ 2,  2,  2,  2],
        [ 3,  3,  3,  3]])

torch.unsqueeze | pytorch.org

torch.unsqueeze(input, dim) → Tensor

Returns a new tensor with a dimension of size one inserted at the specified position.

>>> x = torch.tensor([1, 2, 3, 4])
>>> torch.unsqueeze(x, 0)
tensor([[ 1,  2,  3,  4]])
>>> torch.unsqueeze(x, 1)
tensor([[ 1],
        [ 2],
        [ 3],
        [ 4]])

torch.squeeze | pytorch.org

torch.squeeze(input, dim: int | tuple[int, int, ...int] = None) → Tensor

Returns a tensor with all specified dimensions of input of size 1 removed.

For example, if input is of shape $(A \times 1 \times B \times C \times 1 \times D)$, then the output will be of shape $(A \times B \times C \times D)$.

BUT if dim is given, then a squeeze operation is done only in the given dimension(s).

Example:

>>> x = torch.zeros(2, 1, 2, 1, 2)
>>> x.size()
torch.Size([2, 1, 2, 1, 2])
>>> y = torch.squeeze(x)
>>> y.size()
torch.Size([2, 2, 2])
>>> y = torch.squeeze(x, 0)
>>> y.size()
torch.Size([2, 1, 2, 1, 2])
>>> y = torch.squeeze(x, 1)
>>> y.size()
torch.Size([2, 2, 1, 2])
>>> y = torch.squeeze(x, (1, 2, 3))
>>> y.size()
torch.Size([2, 2, 2])

*.__class__.__name__ in Python

This neat thing works for absolutely everything:

>>> (1).__class__.__name__
'int'
>>> (None).__class__.__name__
'NoneType'
>>> (str).__class__.__name__
'type'
>>> (print).__class__.__name__
'builtin_function_or_method'

You can write text right in the preview!

---
title: "Important document"
output: pdf_document
bibliography: bibliography.bib
# Or even:
bibliography: [bibliography1.bib, bibliography2.bib]
---

Supported:

Format        File extension
BibLaTeX      .bib
BibTeX        .bibtex
Copac         .copac
CSL JSON      .json
CSL YAML      .yaml
EndNote       .enl
EndNote XML   .xml
ISI           .wos
MEDLINE       .medline
MODS          .mods
RIS           .ris

Converting back and forth between .py and .ipynb

(.venv) 127     JupyterNotebooksλ ~/.venv/bin/jupytext --to notebook hypertrain5.py
[jupytext] Reading hypertrain5.py in format py
[jupytext] Writing hypertrain5.ipynb (destination file replaced [use --update to preserve cell outputs and ids])

$ jupyter nbconvert --to script hypertrain5.ipynb

Expanding diagonal_mask

diagonal_mask = diagonal_mask.expand(grad_output.size(1), -1).t()
Exception has occurred: RuntimeError
The expanded size of the tensor (1024) must match the existing size (12) at non-singleton dimension 0.  Target sizes: [1024, -1].  Tensor sizes: [12, 12]
  File "/Users/scales/JupyterNotebooks/hypertrain5.py", line 370, in backward
    diagonal_mask = diagonal_mask.expand(grad_output.size(1), -1).t()
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/scales/JupyterNotebooks/hypertrain5.py", line 520, in train
    loss.backward()
  File "/Users/scales/JupyterNotebooks/hypertrain5.py", line 608, in <module>
    train(
RuntimeError: The expanded size of the tensor (1024) must match the existing size (12) at non-singleton dimension 0.  Target sizes: [1024, -1].  Tensor sizes: [12, 12]

:c
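The error follows from how expand works: it can only stretch dimensions of size 1, and dimension 0 of the mask already has size 12. A small sketch of this, using torch.eye(12) as a stand-in for diagonal_mask. The real mask and the shape I actually wanted are not recorded here, so both are assumptions.

```python
import torch

mask = torch.eye(12)  # shape [12, 12], stand-in for diagonal_mask

expand_failed = False
try:
    mask.expand(1024, -1)  # dim 0 has size 12, not 1, so it cannot be stretched
except RuntimeError as err:
    expand_failed = True
    print("expand failed:", err)

# Inserting a singleton dimension first gives expand something to stretch.
expanded = mask.unsqueeze(0).expand(1024, -1, -1)
print(expanded.shape)  # torch.Size([1024, 12, 12])
```

Whether [1024, 12, 12] is the right target depends on what backward needs; the point is only that unsqueeze has to come before expand.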

Trying something else

diagonal_mask = diagonal_mask.unsqueeze(1).expand(-1, grad_output.size(1), -1)
diagonal_mask = diagonal_mask.permute(0, 2, 1)

grad_output = grad_output.unsqueeze(1) * diagonal_mask
grad_output = grad_output.sum(dim=1)

Running it

grad_output shape: torch.Size([12, 1024])
diagonal_mask shape: torch.Size([12, 12, 1024])

Exception has occurred: RuntimeError
mat2 must be a matrix
  File "/Users/scales/JupyterNotebooks/hypertrain5.py", line 380, in backward
    grad_output = grad_output.mm(diagonal_mask)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/scales/JupyterNotebooks/hypertrain5.py", line 525, in train
    loss.backward()
  File "/Users/scales/JupyterNotebooks/hypertrain5.py", line 613, in <module>
    train(
RuntimeError: mat2 must be a matrix

One dimension too many…
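torch.mm only multiplies two 2-D matrices, and the logged diagonal_mask has three dimensions. A minimal sketch with all-ones tensors of the logged shapes; the values are placeholders, not the real gradients or mask.

```python
import torch

grad_output = torch.ones(12, 1024)        # shape taken from the log
diagonal_mask = torch.ones(12, 12, 1024)  # shape taken from the log

mm_failed = False
try:
    grad_output.mm(diagonal_mask)  # mm accepts 2-D operands only
except RuntimeError as err:
    mm_failed = True
    print("mm failed:", err)

# The elementwise route needs no mm at all:
# [12, 1, 1024] * [12, 12, 1024] -> [12, 12, 1024], then sum over dim 1.
result = (grad_output.unsqueeze(1) * diagonal_mask).sum(dim=1)
print(result.shape)  # torch.Size([12, 1024])
```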

There was one extra line in there; I'll try again

[Figures: mean gradient values by iteration; loss curve; mean squared deviation; R^2]

Compared with the previous iteration:

  • MSE is falling rather than rising
  • $R^2$ is rising rather than falling

By definition (coefficient of determination):

$$ R^2 = 1 - \frac{D[y|x]}{D[y]} $$

  • $D[y] = \sigma_y^2$: the variance of the random variable $y$ itself (the observed values)
  • $D[y|x] = \sigma^2$: the variance of the model error (the difference between the actual and predicted values)
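A minimal sketch of this definition in plain Python, assuming the mean squared residual is used as the sample estimate of $D[y|x]$. The function name r_squared and the toy data are mine, not from the training code.

```python
from statistics import fmean


def r_squared(y_true, y_pred):
    """R^2 = 1 - D[y|x] / D[y], with both variances estimated from the sample."""
    mean_y = fmean(y_true)
    # D[y]: variance of the observed values around their mean.
    var_y = fmean((y - mean_y) ** 2 for y in y_true)
    # D[y|x]: mean squared residual, i.e. the MSE of the model.
    var_err = fmean((y - p) ** 2 for y, p in zip(y_true, y_pred))
    return 1 - var_err / var_y


y_true = [1.0, 2.0, 3.0, 4.0]
print(r_squared(y_true, y_true))     # 1.0: perfect predictions
print(r_squared(y_true, [2.5] * 4))  # 0.0: always predicting the mean
```

Since $D[y]$ is fixed by the data, a falling MSE and a rising $R^2$ are the same observation seen from two sides.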

Next steps:

  • Change the hyperparameters??
  • Test on new data (??)
  • Analyze the plot of regression residuals (??)

Fri 8 Nov

'Axes' object has no attribute 'xlabel'
'Axes' object has no attribute 'xscale'

matplotlib.axes.Axes | matplotlib.org

class matplotlib.axes.Axes(
  fig,
  *args,
  facecolor=None,
  frameon=True,
  sharex=None,
  sharey=None,
  label='',
  xscale=None,
  yscale=None,
  box_aspect=None,
  forward_navigation_events='auto',
  **kwargs
)

An Axes object encapsulates all the elements of an individual (sub-)plot in a figure.

It contains most of the (sub-)plot elements: Axis, Tick, Line2D, Text, Polygon, etc., and sets the coordinate system.

Like all visible elements in a figure, Axes is an Artist subclass.

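So an Axes is one piece of the figure… and it has no .xlabel() or .xscale(): those are pyplot functions. On an Axes the same things are spelled .set_xlabel() and .set_xscale().

I first guessed that .set_xscale() would be called on fig, but Figure has no such method; the scale belongs to each individual Axes.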

matplotlib.pyplot.subplots(
  nrows=1, 
  ncols=1, 
  *, 
  sharex=False, sharey=False, squeeze=True, width_ratios=None, 
  height_ratios=None, subplot_kw=None, gridspec_kw=None, **fig_kw
) -> (Figure, Axes)
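A short sketch of the Axes-level spelling of plt.xlabel and plt.xscale, using the (fig, ax) pair that subplots returns. The Agg backend and the plotted numbers are only there to make the snippet run without a display.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([1, 10, 100], [1, 2, 3])

# pyplot spelling   -> Axes spelling
# plt.xlabel("x")   -> ax.set_xlabel("x")
# plt.xscale("log") -> ax.set_xscale("log")
ax.set_xlabel("x")
ax.set_xscale("log")

print(ax.get_xlabel(), ax.get_xscale())  # x log
print(hasattr(ax, "xlabel"))             # False, hence the AttributeError
```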

matplotlib.figure.Figure | matplotlib.org

class matplotlib.figure.Figure(
  figsize=None, dpi=None, *,
  facecolor=None, edgecolor=None, linewidth=0.0, frameon=None, 
  subplotpars=None, tight_layout=None, constrained_layout=None, 
  layout=None, **kwargs
)

The top level container for all the plot elements.

See matplotlib.figure for an index of class methods.

matplotlib.figure | matplotlib.org

matplotlib.artist | matplotlib.org

Artist.Tick.XTick.ThetaTick
      :    .YTick.RadialTick
      .Patch.FancyArrowPatch.ConnectionPatch
      :     .FancyBboxPatch
      .Collection

Tue 19 Nov

GridSearchCV | scikit-learn.org

CV means "cross-validation".

class sklearn.model_selection.GridSearchCV(
  # Must implement the scikit-learn estimator interface
  estimator,
  # { parametersName: list[parameterValue] } or a list
  # of such dictionaries, in which case the grids spanned
  # by each dictionary in the list are explored.
  # This enables searching over any sequence of parameter settings.
  param_grid,
  *,
  # If single score: string | (???) -> metricScore
  # If multiple score: list[string],
  #                or  tuple[string, ..., string],
  #                or  (???) -> dict[metricName :to metricScore],
  #                or  dict[metricName :to () -> metricScore] # I think so?
  scoring=None,
  n_jobs=None,
  refit=True,
  cv=None,
  verbose=0,
  pre_dispatch='2*n_jobs',
  error_score=nan,
  return_train_score=False
)

I want :keywords in Python. Can I have them?

Grid search workflow:

Grid search workflow diagram
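A minimal usage sketch of the signature above. The dataset (iris), the estimator (SVC) and the grid values are arbitrary choices of mine for illustration, not something taken from the scikit-learn page.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# { parameterName: list[parameterValue] }: 3 * 2 = 6 candidate settings.
param_grid = {"C": [0.1, 1, 10], "kernel": ["linear", "rbf"]}

# Single-metric scoring given as a string; cv=5 means 5-fold cross-validation.
search = GridSearchCV(SVC(), param_grid, scoring="accuracy", cv=5)
search.fit(X, y)

print(search.best_params_)
print(search.best_score_)
```

With refit=True (the default), search.best_estimator_ is retrained on the full data with the winning parameters, so search.predict can be used directly afterwards.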

Wed 20 Nov

GraniteMoe | huggingface.co/docs

Recent studies propose using small proxy models and small corpus to perform hyperparameter searches and transposing the optimal parameters to large models and large corpus.

Readline keybindings may be configured via an initialization file, typically .inputrc in your home directory.

Vi mode and tab completion in python (this is libedit syntax, which belongs in ~/.editrc rather than .inputrc):

python:bind -v 
python:bind ^I rl_complete

SageMath | sagemath.org

SageMath is a free open-source mathematics software system licensed under the GPL. It builds on top of many existing open-source packages: NumPy, SciPy, matplotlib, Sympy, Maxima, GAP¹, FLINT², R and many more. Access their combined power through a common, Python-based language or directly via interfaces or wrappers.

¹ Groups, Algorithms, Programming - A System for Computational Discrete Algebra
² Fast Library for Number Theory

Last update: Jan 1, 2023

Books:

Video lectures:

Blog posts:

How do you design good abstractions? By using abstractions that already exist.

"Abstraction is the elimination of the irrelevant and the amplification of the essential" — Robert C. Martin:

I don't know how to solve the prior knowledge requirement, other than to tell you to study.

I think a good abstraction should also be intuitive. What are the most intuitive abstractions ever? Mathematics.

Links to cool maths:

Interesting conjecture:

Mathematics is a big structure, but it's all based on intuitive axioms. Mathematics is intuitive.

Domain-Driven Design says:

Closure of Operations (what?) leads to object models reminiscent of arithmetic.

Like OOP, category theory is concerned with objects. Morphisms map objects to objects.

Suggestion. Any mathematical object can be described in terms of category theory and analyzed as such. Therefore category theory is the universal tool for analyzing mathematical objects, and therefore very valuable to learn.

Thu 21 Nov

According to Bessis, however, the way math is taught in school emphasizes the logic-based part of this process, when the more important element is intuition.

Math should be thought of as a dialogue between the two: between reason and instinct, between language and abstraction.

The mathematician’s message is for everyone: Look at what you can do if you don’t give up on your intuition.

What we can learn from Descartes, Grothendieck and Thurston is that the nature of the mathematical process is not what people think.

At its core, Bessis says, mathematics is a game of back-and-forth between intuition and logic, instinct and reason.

  • intuition
  • logic
  • instinct
  • reason

$$ \mathrm{Mathematics} := \mathrm{Game}(\mathrm{players:}\ I, L, i, R) $$

Is the utility function our understanding? It can be quantified. Let $U$ be the level of understanding, whatever that means.

Goal: improve mathematical intuition

...easy math is actually deep math that you’ve internalized.

Math is hard. But life, whatever you do, is extremely hard.

Bessis is inspired by the practices of mathematical greats like Bill Thurston, Alexander Grothendieck and René Descartes.

  • Maybe research these people?

Whenever you spot a disconnect between what your gut is telling you and what is supposed to be rational, it’s an important opportunity to understand something new.

When you do math, you’re exposed to the human thought process in a way that is really pure. It’s not just about understanding things, but about understanding things in a very childish, deep, naïve, super clear, obvious way.

What a great article, I award it 10/10⭐️

It inspires us to do math

Thinking

But what is the difference between logic and reason?

logic | ˈlɒdʒɪk | noun [mass noun]

  • reasoning conducted or assessed according to strict principles of validity: experience is a better guide to this than deductive logic | the logic of the argument is faulty.
  • a particular system or codification of the principles of proof and inference: Aristotelian logic.
  • the systematic use of symbolic and mathematical techniques to determine the forms of valid deductive argument.
  • the quality of being justifiable by reason: there seemed to be a lack of logic in his remarks.
  • (the logic of) the course of action suggested by or following as a necessary consequence of: the logic of private competition was to replace small firms by larger firms.

reason | ˈriːzn | noun [mass noun]

  • the power of the mind to think, understand, and form judgements logically: there is a close connection between reason and emotion.
  • what is right, practical, or possible; common sense: people are willing, within reason, to pay for schooling.
  • (one's reason) one's sanity: she is in danger of losing her reason.

Reason forms new thoughts, logic validates them?

Experiment

New report:

report['loss_AugmentedReLUNetwork_3_p0.99_N1'].length 8720

Old report:

report['loss_AugmentedReLUNetwork_3_p0.99_N1'].length 4360

plot_params = PlotParams(measurement='evaluate', dataset=1, metric='f1', p=0.01), Made a dataframe: Empty DataFrame
Columns: [0.25, 0.75, mean]
Index: []
Index([], dtype='object')

Oh right, there are only loss metrics in my report... I need to run evaluations again. And everything will work. Silly me.

$ mv report/report.json report-loss.json

  • Run evaluations again

[cat-theory]

If you think of categories as "generalized partially ordered sets", then a monad is a "generalized closure operator".
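To make the poset side of this analogy concrete, here is a small sketch of a closure operator on the subsets of {1, ..., 6} ordered by inclusion. The particular operator (closing a set under taking multiples) is my own toy example. The three laws checked below are the standard ones, and the comments name the piece of monad structure each law plays the role of.

```python
from itertools import combinations

UNIVERSE = frozenset(range(1, 7))


def closure(s):
    """Close s under taking multiples inside UNIVERSE."""
    return frozenset(m for m in UNIVERSE if any(m % x == 0 for x in s))


def subsets(items):
    """All subsets of items, as frozensets."""
    items = sorted(items)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]


all_subsets = subsets(UNIVERSE)

# Unit (eta): every set sits below its closure.
extensive = all(s <= closure(s) for s in all_subsets)
# Functor action: inclusions s <= t are preserved.
monotone = all(
    closure(s) <= closure(t) for s in all_subsets for t in all_subsets if s <= t
)
# Multiplication (mu): closing twice adds nothing.
idempotent = all(closure(closure(s)) == closure(s) for s in all_subsets)

print(extensive, monotone, idempotent)  # True True True
print(sorted(closure({2, 3})))          # [2, 3, 4, 6]
```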

Sat 23 Nov

ChatGPT uses a lot of water.

The Lens Library | lens.github.io

It's a Haskell library for lenses.

This package provides families of lenses, isomorphisms, folds, traversals, getters, and setters.

The Lens Library FAQ | github.com/ekmett/lens/wiki

One goal of lens has been to provide a consistent vocabulary that lets you access and work with pure data of any sort, while retaining the ability to be able to reason about your code with laws.

It looks like the world wants me to use Haskell.

A typical Haskell package.

Highlight: Control.Monad.Trans.Identity.IdentityT

Sun 24 Nov

Layers are morphisms between tensors of different types.
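One possible reading of this, sketched in plain Python: take tensor shapes as the objects and layers as the arrows, so that two layers compose only when the output shape of the first matches the input shape of the second. The Layer class, its then method and the two toy layers are all hypothetical names made up for this sketch.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Shape = Tuple[int, ...]


@dataclass(frozen=True)
class Layer:
    dom: Shape  # input shape (source object)
    cod: Shape  # output shape (target object)
    fn: Callable[[List[float]], List[float]]

    def then(self, other: "Layer") -> "Layer":
        """Composition: defined only when the shapes line up."""
        if self.cod != other.dom:
            raise TypeError(f"cannot compose {self.cod} with {other.dom}")
        return Layer(self.dom, other.cod, lambda x: other.fn(self.fn(x)))


relu = Layer((3,), (3,), lambda x: [max(0.0, v) for v in x])
total = Layer((3,), (1,), lambda x: [sum(x)])

net = relu.then(total)  # (3,) -> (3,) -> (1,)
print(net.dom, net.cod, net.fn([-1.0, 2.0, 3.0]))  # (3,) (1,) [5.0]

try:
    total.then(relu)  # (1,) does not match (3,)
except TypeError as err:
    print("not composable:", err)
```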

[cgt]: "Open games are... the morphisms of a symmetric monoidal category."

[100pmlb]

Supervised learning tasks:

  • classification
  • regression

Unsupervised learning tasks:

  • clustering
  • dimensionality reduction
  • outlier detection

Semi-supervised learning: many of the samples are unlabelled, but using them should still improve the quality of the model

Reinforcement learning: goal is long-term, decision-making is sequential