When NumPy Arrays Don’t Behave
You’d think that after surviving months of debugging C++ segmentation faults and interpreting cryptic ROOT errors, Python with NumPy would be a safe haven. That illusion doesn’t last long.
Let’s take something innocent:
import numpy as np
a = np.arange(5)
b = a       # b is a second name for the same array, not a copy
b[0] = 10   # a[0] is now 10 as well
If you expected a to remain pure, that’s a rookie mistake. In NumPy, b is just another name for the same array. Mutate one, mutate both. Python’s assignment semantics, where names are just references to objects, strike again, only now with higher stakes and fewer warnings.
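One quick way to confirm the aliasing is to ask Python whether the two names refer to the same object. A minimal sketch, reusing the names a and b from the snippet above:

```python
import numpy as np

a = np.arange(5)
b = a              # no new array is created; b is a second name for a
b[0] = 10

print(b is a)      # True: both names refer to the same object
print(a.tolist())  # [10, 1, 2, 3, 4]
```

The `is` check only catches plain aliasing. It says nothing about two different array objects that share one buffer, which is the next trap.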
Try slicing:
c = a[1:4]   # basic slice: a view into a, not a copy
c[0] = 12    # also sets a[1] to 12
Now c is a window into a, and the write to c[0] has changed a[1] for every analysis step downstream. If you wanted an independent copy, you have to remember to call .copy() explicitly. Forget once, and you can chase silent array mutations through your whole pipeline.
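To make the difference visible, here is a small sketch that puts a sliced view and an explicit copy side by side. The names c_view and c_copy are illustrative, and the .base attribute is used only as a rough check of who owns the memory:

```python
import numpy as np

a = np.arange(5)
c_view = a[1:4]           # basic slice: a view onto a's buffer
c_copy = a[1:4].copy()    # explicit copy: owns its own data

c_copy[0] = 12            # does not touch a
print(a.tolist())         # [0, 1, 2, 3, 4]

c_view[0] = 12            # writes through to a
print(a.tolist())         # [0, 12, 2, 3, 4]

print(c_view.base is a)      # True: the view is backed by a
print(c_copy.base is None)   # True: the copy owns its memory
```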
It gets more interesting. Slicing with a step (stride) still gives you a view, but a reshape operation might give you a view, or it might give you a copy, depending on memory layout. There is no universal law here, only the documentation, some footnotes, and the silent hope that your array is “contiguous.”
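A small experiment makes the layout dependence concrete. The sketch below uses np.shares_memory to ask whether two arrays overlap in memory. The array names are illustrative, and the transposed case is just one example of a layout that forces reshape to copy:

```python
import numpy as np

m = np.arange(6).reshape(2, 3)   # C-contiguous 2x3 array

flat_view = m.reshape(-1)        # contiguous input: reshape can reuse the buffer
flat_copy = m.T.reshape(-1)      # transposed input: no single stride fits, so NumPy copies

print(np.shares_memory(m, flat_view))   # True
print(np.shares_memory(m, flat_copy))   # False

stepped = m[:, ::2]              # basic slicing with a step is a view
print(np.shares_memory(m, stepped))     # True
```

When in doubt, running this kind of check on the real array is cheaper than reasoning about strides by hand.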
And if you use fancy indexing:
d = a[[1, 2, 3]]   # fancy (integer-array) indexing: returns a new array
d[0] = 100         # a is unchanged
Now it is a copy, regardless of what you expect. Slicing is a view, but fancy indexing is a copy. It’s like a quantum state: observation collapses the outcome, and the only certainty is confusion.
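The same check works here. This sketch starts from a fresh array so the result is easy to read. It also shows the one case that tends to surprise people in the other direction: fancy indexing on the left-hand side of an assignment writes into the original array.

```python
import numpy as np

a = np.arange(5)
d = a[[1, 2, 3]]     # fancy indexing on the right-hand side: a new array
d[0] = 100

print(np.shares_memory(a, d))   # False
print(a.tolist())               # [0, 1, 2, 3, 4]

a[[1, 2, 3]] = 100   # fancy indexing on the left-hand side: assigns into a itself
print(a.tolist())               # [0, 100, 100, 100, 4]
```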
For a PhD student, the lesson is clear. When working with NumPy arrays, always check whether your operation returns a view or a copy, preferably before the analysis meeting, not after. Otherwise, your physics might be reproducible, but not for the reasons you expect.