Coding Guide
Introduction
This guide describes common conventions, guidelines, and strategies for contributing code to PlasmaPy. The purpose of this guide is not to provide a set of rigid guidelines that must be adhered to, but rather to provide a common framework that helps us develop PlasmaPy together as a community.
Having a shared coding style makes it easier to understand code written by multiple contributors. The particulars of the coding style are not as important as consistency, readability, and maintainability.
This guide can (and should!) be regularly refined by the PlasmaPy community as we collectively learn new practices and our shared coding style changes. Please feel free to propose revisions to this guide by submitting a pull request or by bringing up an idea at a community meeting.
PlasmaPy generally follows the PEP 8 style guide for Python code, using auto-formatters such as black and isort that are executed using pre-commit.
Coding guidelines
Write short functions that do exactly one thing with no side effects.
Use NumPy array options instead of
for
loops to make code more compact, readable, and performant.Instead of defining variables like
a0
,a1
, &a2
, define these values in a collection such as anndarray
or alist
.Some plasma parameters depend on more than one
Quantity
of the same physical type. For example, when reading the following line of code, we cannot immediately tell which is the electron temperature and which is the ion temperature.f(1e6 * u.K, 2e6 * u.K)
Spell out the parameter names to improve readability and reduce the likelihood of errors.
f(T_i=1e6 * u.K, T_e=2e6 * u.K)
Similarly, when a function has parameters named
T_e
andT_i
, these parameters should be made keyword-only to avoid ambiguity and reduce the chance of errors.def f(*, T_i, T_e): ...
The
__eq__
and__ne__
methods of a class should not raise exceptions. If the comparison for equality is being made between objects of different types, these methods should returnFalse
instead. This behavior is for consistency with operations like1 == "1"
which will returnFalse
.Limit usage of
lambda
functions to one-liners, such as when defining the default factory of adefaultdict
). For anything longer than one line, usedef
instead.List and dictionary comprehensions can be used for simple
for
loops, like:>>> [x**2 for x in range(17) if x % 2 == 0] [0, 4, 16, 36, 64, 100, 144, 196, 256]
Avoid putting any significant implementation code in
__init__.py
files. Implementation details should be contained in a different file, and then imported into__init__.py
.Avoid defining global variables when possible.
Use
assert
statements only in tests.Use formatted string literals (f-strings) instead of legacy formatting for strings.
Functions that accept array_like or
Quantity
inputs should accept and returnnan
(not a number) values. This guideline applies whennan
is the input as well as whennan
values are included in an array.Tip
Normally,
numpy.nan == numpy.nan
evaluates toFalse
, which complicates testingnan
behavior. Theequal_nan
keyword of functions likenumpy.allclose
andnumpy.testing.assert_allclose
makes it so thatnan
is considered equal to itself.Do not use mutable objects as default values in the function or method declaration. This can lead to unexpected behavior.
>>> def function(l=[]): ... l.append("x") ... print(l) ... >>> function() ['x'] >>> function() ['x', 'x']
Use
pathlib
when working with paths to data files.
Names
Names are our most fundamental means of communicating the intent and purpose of code. Wisely chosen names can greatly improve the understandability of code, while inadequate names can obfuscate what the code is supposed to be doing.
PlasmaPy generally uses the PEP 8 conventions for variable names.
Use lowercase words separated by underscores for function and variable names (e.g.,
function_name
andvariable_name
).Use capitalized words without separators when naming a class (e.g.,
ClassName
), but keep acronyms capitalized (e.g.,MHDEquations
).Use capital letters words separated by underscores when naming constants (e.g.,
CONSTANT
orCONSTANT_NAME
).
There are some situations in PlasmaPy which justify a departure from the PEP 8 conventions.
Functions based on plasma parameters that are named after people may be capitalized (e.g.,
Alfven_speed
).Capital letters may be used for a variable when it matches the standard usage in plasma science (e.g.,
B
for magnetic field andT
for temperature).
Choose names that are pronounceable to make them more memorable and compatible with text-to-speech technology.
Choose names will produce more relevant results when searching the internet.
Avoid unnecessary abbreviations, as these make code harder to read. Prefer clarity over brevity, except for code that is used frequently and interactively (e.g., cd or ls).
Tip
Measure the length of a variable not by the number of characters, but rather by the time needed to understand its meaning.
By this measure,
cggglm
is significantly longer thansolve_gauss_markov_linear_model
.Avoid ambiguity. Does
temp
mean “temperature”, “temporary”, or “template”?Append
_e
to a variable name to indicate that it refers to electrons,_i
for ions, and_p
for protons (e.g.,T_e
,T_i
, andT_p
).Only ASCII characters should be used in code that is part of the public API.
Python allows alphanumeric Unicode characters to be used in object names (e.g.,
πλάσμα
orφυσική
). These characters may be used for internal code when doing so improves readability (i.e., to match a commonly used symbol) and in Jupyter notebooks.If a plasma parameter has multiple names, then use the name that provides the most physical insight. For example,
gyrofrequency
indicates gyration butLarmor_frequency
does not.It is usually preferable to name a variable after its name rather than its symbol. An object named
Debye_length
is more broadly understandable and searchable thanlambda_D
. However, there are some exceptions to this guideline.Symbols used widely across plasma science can be used with low risk of confusion, such as \(T\) for temperature or \(β\) for plasma
beta
.Symbols that are defined in docstrings can be used with decreased likelihood of confusion.
Sometimes code that represents an equation will be more readable if the Unicode characters for the symbols are used, especially for complex equations. For someone who is familiar with the symbols,
λ = c / ν
will be more readable thanlambda = c / nu
orwavelength = speed_of_light / frequency
.If an implementation is based on a journal article, then variable names may be based on the symbols used in that article. The article should be cited in the appropriate docstring so that it appears in the Bibliography.
To mark that an object is not part of PlasmaPy’s public API, begin its name with a leading underscore (e.g.,
_private_variable
). Private variables should not be included in__all__
.Avoid single character variable names except for standard plasma physics symbols (e.g.,
B
) or as indices infor
loops.Avoid encoding type information in a variable name.
Intermediate variable names can provide additional context and meaning. For example, suppose we have a conditional operating on a complicated expression:
if u[0] < x < u[1] and v[0] < y < v[1] and w[0] < z < w[1]: ...
Defining an intermediate variable allows us to communicate the meaning and intent of the expression.
point_is_in_grid_cell = u[0] < x < u[1] and v[0] < y < v[1] and w[0] < z < w[1] if point_is_in_grid_cell: ...
In
for
loops, this may take the form of assignment expressions with the walrus operator (:=
).
Tip
It is common for an integrated development environment (IDE) to have a built-in tool for simultaneously renaming a variable throughout a project. For example, a rename refactoring in PyCharm can be done with Shift+F6 on Windows or Linux, and ⇧F6 or ⌥⌘R on macOS.
Error messages
Error messages are a vital but underappreciated form of documentation. A good error message can help someone pinpoint the source of a problem in seconds, while a cryptic or missing error message can lead to hours of frustration.
Use error messages to indicate the source of the problem while providing enough information for the user to troubleshoot it. When possible, make it clear what the user should do next.
Include diagnostic information when appropriate. For example, if an error occurred at a single index in an array operation, then including the index where the error happened can help the user better understand the cause of the error.
Write error messages that are concise when possible, as users often skim or skip long error messages.
Avoid including information that is irrelevant to the source of the problem.
Write error messages in language that is plain enough to be understandable to someone who is undertaking their first research project.
If necessary, technical information may be placed after a plain language summary statement.
Alternatively, an error message may reference a docstring or a page in the narrative documentation.
Write error messages that are friendly, supportive, and helpful. Error message should never be condescending or blame the user.
Project infrastructure
Imports
Use standard abbreviations for imported packages:
import astropy.constants as const import astropy.units as u import matplotlib.pyplot as plt import numpy as np import pandas as pd
PlasmaPy uses isort to organize import statements via a pre-commit hook.
For infrequently used objects, import the package, subpackage, or module rather than the individual code object. Including more of the namespace provides contextual information that can make code easier to read. For example,
json.loads
is more readable than using onlyloads
.For frequently used objects (e.g.,
Particle
) and type hint annotations (e.g.,Optional
andReal
), import the object directly instead of importing the package, subpackage, or module. Including more of the namespace would increase clutter and decrease readability without providing commensurately more information.Use absolute imports (e.g.,
from plasmapy.particles import Particle
) rather than relative imports (e.g.,from ..particles import Particle
).Do not use star imports (e.g.,
from package.subpackage import *
), except in very limited situations.
Requirements
Package requirements are specified in
pyproject.toml
.tox.ini
also contains a testing environment for the minimal dependencies.Each release of PlasmaPy should support all minor versions of Python that have been released in the prior 42 months, and all minor versions of NumPy that have been released in the last 24 months. This schedule was proposed in NumPy Enhancement Proposal 29 for the scientific Python ecosystem, and has been adopted by upstream packages such as NumPy, matplotlib, and Astropy.
Tip
Tools like pyupgrade help automatically upgrade the code base to the minimum supported version of Python for the next release.
PlasmaPy should generally allow all feature releases of required dependencies made in the last ≲ 24 months, unless a more recent release includes a needed feature or bugfix.
Only set maximum or exact requirements (e.g.,
numpy <= 1.22.3
orscipy == 1.7.2
) when absolutely necessary. After setting a maximum or exact requirement, create a GitHub issue to remove that requirement.Tip
Maximum requirements can lead to version conflicts when installed alongside other packages. It is preferable to update PlasmaPy to become compatible with the latest versions of its dependencies than to set a maximum requirement.
Minor versions of Python are generally released in October of each year. However, it may take a few months before packages like NumPy and Numba become compatible with the newest minor version of Python.
Decorators
Transforming particle-like arguments into particle objects
Use particle_input()
to transform arguments to relevant Particle
,
CustomParticle
, or ParticleList
objects (see Particles).
Validating Quantity arguments
Use validate_quantities()
to verify function arguments and impose
relevant restrictions
from plasmapy.utils.decorators.validators import validate_quantities
@validate_quantities(
n={"can_be_negative": False},
validations_on_return={"equivalencies": u.dimensionless_angles()},
)
def inertial_length(n: u.m**-3, particle) -> u.m:
...
Use validate_quantities()
to enforce Quantity
type hints
@validate_quantities
def magnetic_pressure(B: u.T) -> u.Pa:
return B**2 / (2 * const.mu0)
Special function categories
Aliases
An alias is an abbreviated version of a commonly used function.
For example, va_
is an alias to
Alfven_speed
.
Aliases are intended to give users the option for shortening their code while maintaining some readability and explicit meaning. As such, aliases are given to functionality that already has a widely-used symbol in plasma literature.
Here is a minimal example of an alias f_
to function
as
would be defined in plasmapy/subpackage/module.py
.
__all__ = ["function"]
__aliases__ = ["f_"]
__all__ += __aliases__
def function():
...
f_ = function
"""Alias to `~plasmapy.subpackage.module.function`."""
Aliases should only be defined for frequently used plasma parameters which already have a symbol that is widely used in the community’s literature. This is to ensure that the abbreviated function name is still reasonably understandable. For example,
cwp_
is a shortcut for \(c/ω_p\).The name of an alias should end with a trailing underscore.
An alias should be defined immediately after the original function.
Each alias should have a one-line docstring that refers users to the original function.
The name of the original function should be included in
__all__
near the top of each module, and the name of the alias should be included in__aliases__
, which will then get appended to__all__
. This is done so both the alias and the original function get properly documented.Aliases are intended for end users, and should not be used in PlasmaPy or other collaborative software development efforts because of reduced readability and searchability for someone new to plasma science.
Lite Functions
Most functions in plasmapy.formulary
accept Quantity
instances as
arguments and use validate_quantities()
to verify that Quantity
arguments are valid. The use of Quantity
operations and validations do
not noticeably impact performance during typical interactive use, but
the performance penalty can become significant for numerically intensive
applications.
A lite-function is an optimized version of another plasmapy
function that accepts numbers and NumPy arrays in assumed SI units.
Lite-functions skip all validations and instead prioritize
performance. Most lite-functions are defined in
plasmapy.formulary
.
Caution
Unlike most formulary
functions, no validations are
performed on the arguments provided to a lite-function for
the sake of computational efficiency. When using
lite-functions, it is vital to double-check your
implementation!
Here is a minimal example of a lite-function function_lite
that corresponds to function
as would be defined in
plasmapy/subpackage/module.py
.
__all__ = ["function"]
__lite_funcs__ = ["function_lite"]
from numbers import Real
from numba import njit
from plasmapy.utils.decorators import bind_lite_func, preserve_signature
__all__ += __lite_funcs__
@preserve_signature
@njit
def function_lite(v: Real) -> Real:
"""
The lite-function which accepts and returns real numbers in
assumed SI units.
"""
...
@bind_lite_func(function_lite)
def function(v):
"""A function that accepts and returns Quantity arguments."""
...
The name of each lite-function should be the name of the original function with
_lite
appended at the end. For example,thermal_speed_lite
is the lite-function associated withthermal_speed
.Lite-functions assume SI units for all arguments that represent physical quantities.
Lite-functions should be defined immediately before the normal version of the function.
Lite-functions should be used by their associate non-lite counterpart, except for well reasoned exceptions. This is done to reduce code duplication.
Lite-functions are bound to their normal version as the
lite
attribute using thebind_lite_func
decorator. This allows the lite-function to also be accessed likethermal_speed.lite()
.If a lite-function is decorated with something like
@njit
, then it should also be decorated withpreserve_signature
. This preserves the function signature so interpreters can still give hints about function arguments.When possible, a lite-function should incorporate numba’s just-in-time compilation or utilize Cython. At a minimum any “extra” code beyond the raw calculation should be removed.
The name of the original function should be included in
__all__
near the top of each module, and the name of the lite-function should be included in__lite_funcs__
, which will then get appended to__all__
. This is done so both the lite-function and the original function get properly documented.
Physics
Units
PlasmaPy uses astropy.units
to assign physical units to values in the
form of a Quantity
.
>>> import astropy.units as u
>>> 5 * u.m / u.s
<Quantity 5. m / s>
Using astropy.units
improves compatibility with Python packages in
adjacent fields such as astronomy and heliophysics. To get started with
astropy.units
, check out this example notebook on units.
Only SI units should be used within PlasmaPy, unless there is a strong justification to do otherwise. Example notebooks may occasionally use other unit systems to show the flexibility of
astropy.units
.Use operations between
Quantity
instances except when needed for performance. To improve performance inQuantity
operations, check out performance tips forastropy.units
.Use unit annotations with the
validate_quantities()
decorator to validateQuantity
arguments and return values (see Validating Quantity arguments).Caution
Recent versions of Astropy allow unit-aware
Quantity
annotations such asu.Quantity[u.m]
. However, these annotations are not yet compatible withvalidate_quantities()
.Avoid using electron-volts as a unit of temperature within PlasmaPy because it is defined as a unit of energy. However, functions in
plasmapy.formulary
and elsewhere should accept temperatures in units of electron-volts, which can be done usingvalidate_quantities()
.Non-standard unit conversions can be made using equivalencies such as
temperature_energy
.>>> (1 * u.eV).to(u.K, equivalencies=u.temperature_energy()) 11604.518...
The names of SI units should not be capitalized except at the beginning of a sentence, including when they are named after a person. The sole exception is “degree Celsius”.
Particles
The Particle
class provides an object-oriented interface for accessing
basic particle data. Particle
accepts particle-like inputs.
>>> from plasmapy.particles import Particle
>>> alpha = Particle("He-4 2+")
>>> alpha.mass
<Quantity 6.6446...e-27 kg>
>>> alpha.charge
<Quantity 3.20435...e-19 C>
To get started with plasmapy.particles
, check out this example
notebook on particles.
Avoid using implicit default particle assumptions for function arguments (see issue #453).
The
particle_input()
decorator can automatically transform a particle-like argument into aParticle
,CustomParticle
, orParticleList
instance when the corresponding parameter is decorated withParticleLike
.from plasmapy.particles import ParticleLike, particle_input @particle_input def get_particle(particle: ParticleLike): return particle
If we use
get_particle
on something particle-like, it will return the corresponding particle object.>>> return_particle("p+") Particle("p+")
The documentation for
particle_input()
describes ways to ensure that the particle meets certain categorization criteria.
Equations and Physical Formulae
Physical formulae should be inputted without first evaluating all of the physical constants. For example, the following line of code obscures information about the physics being represented:
omega_ce = 1.76e7*(B/u.G)*u.rad/u.s # doctest: +SKIP
In contrast, the following line of code shows the exact formula which makes the code much more readable.
omega_ce = (e * B) / (m_e * c) # doctest: +SKIP
The origins of numerical coefficients in formulae should be documented.
Docstrings should describe the physics associated with these quantities in ways that are understandable to students who are taking their first course in plasma physics while still being useful to experienced plasma physicists.
Angular Frequencies
Unit conversions involving angles must be treated with care. Angles are
dimensionless but do have units. Angular velocity is often given in
units of radians per second, though dimensionally this is equivalent to
inverse seconds. Astropy will treat radians dimensionlessly when using
the dimensionless_angles
equivalency, but
dimensionless_angles
does not account for the multiplicative
factor of \(2π\) that is used when converting between frequency
(1/s) and angular frequency (rad/s). An explicit way to do this
conversion is to set up an equivalency between cycles/s and Hz:
import astropy.units as u
f_ce = omega_ce.to(u.Hz, equivalencies=[(u.cy/u.s, u.Hz)]) # doctest: +SKIP
However, dimensionless_angles
does work when dividing a velocity by
an angular frequency to get a length scale:
d_i = (c/omega_pi).to(u.m, equivalencies=u.dimensionless_angles()) # doctest: +SKIP
Example notebooks
Examples in PlasmaPy are written as Jupyter notebooks, taking advantage of their mature ecosystems. They are located in docs/notebooks. nbsphinx takes care of executing them at documentation build time and including them in the documentation.
Please note that it is necessary to store notebooks with their outputs stripped (use the “Edit -> Clear all” option in JupyterLab and the “Cell -> All Output -> Clear” option in the “classic” Jupyter Notebook). This accomplishes two goals:
helps with versioning the notebooks, as binary image data is not stored in the notebook
signals nbsphinx that it should execute the notebook.
Note
In the future, verifying and running this step may be automated via a GitHub bot. Currently, reviewers should ensure that submitted notebooks have outputs stripped.
If you have an example notebook that includes packages unavailable in
the documentation building environment (e.g., bokeh
) or runs some
heavy computation that should not be executed on every commit, keep the
outputs in the notebook but store it in the repository with a
preexecuted_
prefix (e.g.,
preexecuted_full_3d_mhd_chaotic_turbulence_simulation.ipynb
).
Compatibility with Prior Versions of Python, NumPy, and Astropy
PlasmaPy releases will generally abide by the following standards, which are adapted from NEP 29 for the support of old versions of Python, NumPy, and Astropy.
PlasmaPy should support at least the minor versions of Python initially released 42 months prior to a planned project release date.
PlasmaPy should support at least the 3 latest minor versions of Python.
PlasmaPy should support minor versions of NumPy initially released in the 24 months prior to a planned project release date or the oldest version that supports the minimum Python version (whichever is higher).
PlasmaPy should support at least the 3 latest minor versions of NumPy and Astropy.
The required major and minor version numbers of upstream packages may only change during major or minor releases of PlasmaPy, and never during patch releases.
Exceptions to these guidelines should only be made when there are major improvements or fixes to upstream functionality or when other required packages have stricter requirements.
Benchmarks
PlasmaPy has a set of asv benchmarks that monitor performance of its functionalities. This is meant to protect the package from performance regressions. The benchmarks can be viewed at benchmarks. They are generated from results located in benchmarks-repo. Detailed instructions on writing such benchmarks can be found at asv-docs. Up-to-date instructions on running the benchmark suite will be located in the README file of benchmarks-repo.
Comments
A well-placed and well-written comment can prevent future frustrations. However, comments are not inherently good. As code evolves, an unmaintained comment may become outdated, or get separated from the section of code that it was meant to describe. Cryptic and obsolete comments may end up confusing contributors. In the worst case, an unmaintained comment may contain inaccurate or misleading information (hence the saying that “a comment is a lie waiting to happen”).
Important
The code we write should read like a book. The full meaning of code’s functionality should be attainable by reading the code. Comments should only be used when the code itself cannot communicate its full meaning.
Refactor code to make it more readable, rather than explaining how it works [Wilson et al., 2014].
Instead of using a comment to define a variable, rename the variable to encode its meaning and intent. For example, code like:
could be achieved with no comment by doing:
Use comments to communicate information that you wish you knew before starting to work on a particular section of code, including information that took some time to learn.
Use comments to communicate information that the code cannot, such as why an alternative approach was not taken.
Use comments to include references to books or articles that describe the equation, algorithm, or software design pattern that is being implemented. Even better, include these references in docstrings.
Provide enough contextual information in the comment for a new user to be able to understand it.
Remove commented out code before merging a pull request.
When updating code, be sure to review and update, if necessary, associated comments too!
When a comment is used as the header for a section of code, consider extracting that section of code into its own function. For example, we might start out with a function that includes multiple lines of code for each step.
We can apply the extract function refactoring pattern by creating a separate function for each of these steps. The name of each function can often be extracted directly from the comment.
This refactoring pattern is appropriate for long functions where the different steps can be cleanly separated from each other. This pattern leads to functions that are shorter, more reusable, and easier to test. The original function contains fewer low-level implementation details and thus gives a higher level view of what the function is doing. This pattern reduces cognitive complexity.
The extract function refactoring pattern should be used judiciously, as taking it to an extreme and applying it at too fine of a scale can reduce readability and maintainability by producing overly fragmented code.
Hint
The extract function refactoring pattern might not be appropriate if the different sections of code are intertwined with each other (e.g., if both sections require the same intermediate variables). An alternative in such cases would be to create a class instead.