
Question on relation to univariate Taylor series propagation for higher-order derivatives #1

thisiscam opened this issue Dec 19, 2024 · 2 comments


@thisiscam

Hi,

Thank you for your interesting work! I’ve been reading your paper and found the idea of stochastically evaluating higher-order derivatives quite intriguing.

I was wondering if you could comment on how your technique relates to the known method of evaluating higher-order derivative tensors by propagating a set of univariate Taylor series. This idea is discussed in the paper by Griewank, Utke, and Walther (GUW98) and in the Evaluating Derivatives book (Chapter 13).

Specifically, could your method be interpreted (or extended) as sampling a set of univariate Taylor series to propagate, with the aim of approximating the derivative tensor? I'd love to hear your thoughts on this connection or any fundamental differences.
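
To make the question concrete, here is the kind of univariate propagation I have in mind, sketched with `jax.experimental.jet` (the test function and direction are arbitrary):

```python
# Sketch of univariate Taylor propagation: push the curve x(t) = x + v t
# through f and read off d^k/dt^k f(x + v t) at t = 0. GUW98 recovers the
# full derivative tensor by interpolating from many such directions v
# (integer combinations of basis vectors).
import jax.numpy as jnp
from jax.experimental.jet import jet

f = lambda x: jnp.sin(x[0]) * jnp.cos(x[1])

x = jnp.array([0.3, 0.7])
v = jnp.array([1.0, 2.0])  # a single perturbation direction
k = 3                      # derivative order

# The input jet [v, 0, ..., 0] encodes x(t) = x + v t; the output series
# holds d^m/dt^m f(x(t)) |_{t=0} for m = 1..k.
series_in = [v] + [jnp.zeros_like(v)] * (k - 1)
_, series_out = jet(f, (x,), (series_in,))
print(series_out[-1])  # k-th directional derivative along v
```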

Thanks in advance for your response!

@thisiscam changed the title from "Question on relation to univariate Taylor series for higher-order derivatives" to "Question on relation to univariate Taylor series propagation for higher-order derivatives" on Dec 19, 2024
@zekun-shi (Collaborator)

Thanks for bringing up this highly relevant work! Having read the paper, I agree that the idea of evaluating arbitrary derivative tensor elements via forward propagation of univariate Taylor series is similar. I wasn't aware of this work, and based on my experience talking to people at NeurIPS, it is not well known within the ML community, even among people who work on AD in an ML context. The JAX team members who wrote the Taylor-mode AD I used in the paper weren't aware of this technique and were quite surprised that one could do this.

So yes, your interpretation is right: STDE approximates the derivative tensor with randomized univariate Taylor series propagation.
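
As a toy illustration of that randomized view (not the paper's exact estimator or sampling distribution, just the idea), one can estimate the Laplacian $\operatorname{tr}(\nabla^2 f)$ by pushing random 2-jets forward and averaging:

```python
# Toy randomized estimate of the Laplacian tr(H) of f at x: push the
# 2-jet [v, 0] forward, whose 2nd output coefficient is v^T H v, and
# average over random v with E[v v^T] = I (Hutchinson-style).
import jax
import jax.numpy as jnp
from jax.experimental.jet import jet

f = lambda x: jnp.sin(x).sum() * jnp.cos(x[0])

def laplacian_estimate(key, x, n_samples=4096):
    # Rademacher directions satisfy E[v v^T] = I.
    vs = jax.random.rademacher(key, (n_samples, x.shape[0]), dtype=x.dtype)

    def quad_form(v):
        # The 2nd Taylor coefficient of t -> f(x + v t) is exactly v^T H v.
        _, (_, f2) = jet(f, (x,), ([v, jnp.zeros_like(v)],))
        return f2

    return jax.vmap(quad_form)(vs).mean()  # E[v^T H v] = tr(H)

key = jax.random.PRNGKey(0)
x = jnp.arange(1.0, 6.0) / 10.0
print(laplacian_estimate(key, x))  # compare: jnp.trace(jax.hessian(f)(x))
```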

@thisiscam (Author)

thisiscam commented Jan 10, 2025

Thank you for your detailed response! I wanted to follow up with some additional thoughts and questions:

  1. I agree that the method of propagating univariate Taylor series to compute higher-order derivative tensors is not widely known in the ML community, which is indeed unfortunate given its mention in the classic Evaluating Derivatives book. It's great to see that your work brings attention to this approach, even if from a different perspective!

  2. That said, I think there's a key distinction between your method and GUW98: GUW98 considers only first-order perturbations, whereas your work (as seen in Eq. (10) and Appendix F.1) includes higher-order perturbations.
    Specifically, (it appears to me that) GUW98 only considers series inputs of the form
    $[v, 0, \ldots, 0]$, where $v$ is an integer linear combination of the basis vectors $e_i$,
    while your work (please correct me if I'm wrong) considers inputs of the form
    $[\ldots, 0, e_1, 0, \ldots, 0, e_2, 0, \ldots, 0, e_k]$ that scatter the basis vectors across the higher-order perturbations.
    In Eq. (10) of your paper, the coefficient $C_{d_1 \dots d_k}$ is the coefficient associated with each directional derivative. However, I couldn't find a clear algorithm or explicit description in your paper for computing $C$. Could you clarify whether you have a formula or algorithm for computing $C$ that generalizes Eq. (13) in GUW98? (One such higher-order placement is checked numerically in the sketch after this list.) It would be interesting to explore whether including higher-order perturbations, as in your work, might lead to better algorithms or approximations in some contexts.

  3. I found your comment in the paper on fully mixed partial derivatives $\frac{\partial^k f}{\partial x_1 \cdots \partial x_k}$ (i.e., maximally non-diagonal operators) particularly interesting. Your paper mentions that the minimum $l$ required for these cases is $(1 + k)k/2$. However, when generalizing your example, I noticed that calculating the fully mixed partial derivative from a single univariate jet (without cancellation steps) seems to require an exponential $l = O(k \cdot 2^k)$-th order derivative. For instance:

    • When $k = 2$, positions $[0, e_1, e_2]$ work.
    • When $k = 3$, positions $[0, 0, 0, e_1, 0, e_2, e_3]$ work.
    • In general, the perturbation $e_i$ appears at index $d_i = 2^{k-1} + \ldots + 2^{k-i} = 2^k - 2^{k-i}$, so the nonzero part of the jet has length $2^k - 1$ and the mixed partial is read off at order $l = \sum_i d_i = (k - 1)2^k + 1$. Could you clarify whether the quadratic $l = (1 + k)k/2$ formula in your paper gives a more efficient method, or some insight into improving this computation? (The $k = 2$ case is checked numerically in the sketch below.)
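
To make the $k = 2$ case concrete, here is a quick numerical sketch with `jax.experimental.jet` (the test function is arbitrary, and the constant $10 = 5!/(2!\,3!)$ is my own Faà di Bruno count, so please double-check):

```python
# Check the k = 2 construction: put e_1 at order 2 and e_2 at order 3,
# pad with zeros up to order 5 = 2 + 3, and read the 5th output Taylor
# coefficient. By Faà di Bruno it should equal C * d^2 f / dx_1 dx_2 with
# C = 5!/(2! * 3!) = 10, since 5 = 2 + 3 is the only way to write 5 as a
# sum of the perturbation orders {2, 3} (so no cancellation is needed).
import jax
import jax.numpy as jnp
from jax.experimental.jet import jet

f = lambda x: jnp.sin(x[0] * x[1]) + x[0] * x[1] ** 2

x = jnp.array([0.4, 0.9])
e1, e2 = jnp.array([1.0, 0.0]), jnp.array([0.0, 1.0])
zero = jnp.zeros(2)

# Input jet [0, e1, e2, 0, 0] encodes x(t) = x + e1 t^2/2! + e2 t^3/3!.
_, series_out = jet(f, (x,), ([zero, e1, e2, zero, zero],))
mixed_from_jet = series_out[4] / 10.0

# Ground truth via nested first-order AD.
mixed_true = jax.hessian(f)(x)[0, 1]
print(mixed_from_jet, mixed_true)  # should agree
```

If this reading is right, the coefficient would be $C_{d_1 \dots d_k} = l!/(d_1! \cdots d_k!)$ whenever the orders $d_i$ are distinct and $l = \sum_i d_i$ has no other decomposition into the available orders, which is the kind of generalization of GUW98's Eq. (13) I was hoping for.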

Looking forward to your thoughts!
