Commit 5b03fbaa authored by dh00601

fixed notebook for luminosity functions

parent 60647b84
%% Cell type:markdown id:bbbaafbb-fd7d-4b73-a970-93506ba35d71 tags:
# Example use case: Zero-age stellar luminosity function
In this notebook we compute the luminosity function of the zero-age main-sequence by running a population of single stars using binary_c.
We start by loading in some standard Python modules and the binary_c module.
%% Cell type:code id:bf6b8673-a2b5-4b50-ad1b-e90671f57470 tags:
``` python
import os
import math
import matplotlib.pyplot as plt
from binarycpython.utils.functions import temp_dir
from binarycpython.utils.grid import Population
TMP_DIR = temp_dir("notebooks", "notebook_luminosity")
# help(Population) # Uncomment this line to see the public functions of this object
```
%% Cell type:markdown id:f268eff3-4e08-4f6b-8b59-f22dba4d2074 tags:
## Setting up the Population object
To set up and configure the population object we need to make a new instance of the `Population` object and configure it with the `.set()` function.
In our case, we only need to set the maximum evolution time to something short, because we care only about zero-age main sequence stars which have, by definition, age zero.
%% Cell type:code id:79ab50b7-591f-4883-af09-116d1835a751 tags:
``` python
# Create population object
population = Population()
# If you want verbosity, set this before other things
population.set(verbosity=1)
# Setting values can be done via .set(<parameter_name>=<value>)
# Values that are known to be binary_c_parameters are loaded into bse_options.
# Those that are present in the default grid_options are set in grid_options
# All other values that you set are put in a custom_options dict
population.set(
    # binary_c physics options
    max_evolution_time=0.1,  # maximum stellar evolution time in Myr
    tmp_dir=TMP_DIR,
)
# We can access the options through
print("verbosity is", population.grid_options['verbosity'])
```
%% Output
adding: tmp_dir=/tmp/binary_c_python-david/notebooks/notebook_luminosity to grid_options
verbosity is 1
%% Cell type:markdown id:f9a65554-36ab-4a04-96ca-9f1422c307fd tags:
## Adding grid variables
The main purpose of the Population object is to handle the population synthesis side of running a set of stars. The main method to do this with binarycpython, as is the case with Perl binarygrid, is to use grid variables. These are loops over a predefined range of values, where a probability will be assigned to the systems based on the chosen probability distributions.
Usually we use either a single mass grid variable, or a trio of mass, mass ratio and period (other notebooks cover these examples). We can, however, also add grid sampling for e.g. eccentricity, metallicity or other parameters.
To add a grid variable to the population object we use `population.add_grid_variable`
%% Cell type:code id:68c84521-9ae8-4020-af7a-5334173db969 tags:
``` python
# help(population.add_grid_variable)
```
%% Cell type:markdown id:2a9104fc-4136-4e53-8604-f24ad52fbe56 tags:
First let us set up some global variables that will be useful throughout.
* The resolution is the number of stars we simulate in our model population.
* The massrange is the (min, max) range of masses.
* The total_probability is the theoretical integral of a probability density function, i.e. 1.0.
* The binwidth sets the resolution of the final distribution: if set to 0.5, the bins in log *L* are 0.5 dex wide.
%% Cell type:code id:aba3fe4e-18f2-4bb9-8e5c-4c6007ab038b tags:
``` python
# Set resolution and mass range that we simulate
resolution = {"M_1": 40} # start with a low resolution, and increase later if you want "more accurate" data
massrange = (0.07, 100.0) # we work with stars of mass 0.07 to 100 Msun
total_probability = 1.0 # theoretical integral of the mass probability density function over all masses
# distribution binwidths :
# (log10) luminosity distribution
binwidth = { 'luminosity' : 0.5 }
```
%% Cell type:markdown id:1b3a007b-5c17-42a7-a981-7e268e6f545c tags:
The next cell contains an example of adding the mass grid variable, sampling the phase space in linear mass *M*_1.
%% Cell type:code id:47979841-2c26-4b26-8945-603d013dc93a tags:
``` python
# Mass
population = Population()
population.set(
    tmp_dir=TMP_DIR,
)
population.add_grid_variable(
    name="M_1",
    longname="Primary mass",
    valuerange=massrange,
    samplerfunc="self.const_linear({min}, {max}, {res})".format(
        min=massrange[0],
        max=massrange[1],
        res=resolution["M_1"]
    ),
    probdist="{probtot}/({max} - {min})".format(
        probtot=total_probability,
        min=massrange[0],
        max=massrange[1]
    ),  # dprob/dm1 : all stars are equally likely, so this is 1.0 / (Mmax - Mmin)
    dphasevol="dM_1",
    parameter_name="M_1",
    condition="",  # impose a condition on this grid variable; mostly a self-check
)
```
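The bookkeeping behind `probdist` and `dphasevol` can be checked in plain Python. This is an illustrative standalone sketch, not binarycpython's internal sampler (the midpoint sampling here is an assumption): with a flat dprob/dM and a linearly spaced grid, the per-system probabilities must sum to 1.

``` python
# Standalone sanity check (illustrative, not binarycpython internals):
# with a flat probability density dprob/dM = 1/(Mmax - Mmin) and a
# linearly spaced grid, the per-system probabilities must sum to 1.
m_min, m_max, n = 0.07, 100.0, 40   # same grid bounds and resolution as above

dm = (m_max - m_min) / n            # linear phase-space volume per sampled system
probdist = 1.0 / (m_max - m_min)    # flat dprob/dM

# each system carries probability probdist * dm; summed over the grid:
total = sum(probdist * dm for _ in range(n))
print(total)   # -> 1.0 (up to floating-point rounding)
```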
%% Cell type:markdown id:163f13ae-fec1-4ee8-b9d4-c1b75c19ff39 tags:
## Setting logging and handling the output
By default, binary_c will not output anything (except for 'SINGLE STAR LIFETIME'). It is up to us to determine what will be printed. We can either do that by hardcoding print statements into `binary_c` (see the binary_c documentation), or we can use the custom logging functionality of binarycpython (see notebook `notebook_custom_logging.ipynb`), which is faster to set up and requires no recompilation of binary_c, but is somewhat more limited in functionality. For our current purposes it works perfectly well.
After configuring what will be printed, we need to make a function to parse the output. This can be done by setting the parse_function parameter in the population object (see also notebook `notebook_individual_systems.ipynb`).
In the code below we will set up both the custom logging and a parse function to handle that output.
%% Cell type:code id:0c986215-93b1-4e30-ad79-f7c397e9ff7d tags:
``` python
# Create custom logging statement
#
# we check that the model number is zero, i.e. we're on the first timestep (stars are born on the ZAMS)
# we make sure that the stellar type is <= MAIN_SEQUENCE, i.e. the star is a main-sequence star
# we also check that the time is 0.0 (this is not strictly required, but good to show how it is done)
#
# The Printf statement does the outputting: note that the header string is ZERO_AGE_MAIN_SEQUENCE_STAR
custom_logging_statement = """
if(stardata->model.model_number == 0 &&
   stardata->star[0].stellar_type <= MAIN_SEQUENCE &&
   stardata->model.time == 0)
{
    /* Note that we use Printf - with a capital P! */
    Printf("ZERO_AGE_MAIN_SEQUENCE_STAR %30.12e %g %g %g %g\\n",
           stardata->model.time,              // 1
           stardata->common.zero_age.mass[0], // 2
           stardata->star[0].mass,            // 3
           stardata->star[0].luminosity,      // 4
           stardata->model.probability        // 5
    );
};
"""
population.set(
    C_logging_code=custom_logging_statement
)
```
%% Cell type:markdown id:ae1f1f0c-1f8b-42d8-b051-cbf8c6b51514 tags:
The parse function must now catch lines that start with "ZERO_AGE_MAIN_SEQUENCE_STAR" and process the associated data.
%% Cell type:code id:fd197154-a8ce-4865-8929-008d3483101a tags:
``` python
# import the bin_data function so we can construct finite-resolution probability distributions
# import the datalinedict to make a dictionary from each line of data from binary_c
from binarycpython.utils.functions import bin_data, datalinedict

def parse_function(self, output):
    """
    Example parse function
    """
    # list of the data items
    parameters = ["header", "time", "zams_mass", "mass", "luminosity", "probability"]

    # Loop over the output.
    for line in output.splitlines():
        # obtain the line of data in dictionary form
        linedata = datalinedict(line, parameters)

        # Check the header and act accordingly
        if linedata['header'] == "ZERO_AGE_MAIN_SEQUENCE_STAR":
            # bin the log10(luminosity) to the nearest binwidth (0.5 dex here)
            binned_log_luminosity = bin_data(math.log10(linedata['luminosity']),
                                             binwidth['luminosity'])

            # append the data to the results_dictionary
            self.grid_results['luminosity distribution'][binned_log_luminosity] += linedata['probability']

        # verbose reporting
        #print("parse out results_dictionary=",self.grid_results)

# Add the parsing function
population.set(
    parse_function=parse_function,
)
```
%% Output
adding: parse_function=<function parse_function at 0x7f6920fd2430> to grid_options
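The binning step in the parse function only needs to snap each log-luminosity onto a fixed-width grid. As a rough standalone sketch of that behaviour (`bin_value` below is a hypothetical stand-in written for illustration; binarycpython ships its own `bin_data`, which may differ):

``` python
def bin_value(x, binwidth):
    """Snap x to the nearest multiple of binwidth (illustrative stand-in
    for binarycpython's bin_data; the real implementation may differ)."""
    return binwidth * round(x / binwidth)

print(bin_value(3.7, 0.5))   # -> 3.5
print(bin_value(3.8, 0.5))   # -> 4.0
```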
%% Cell type:markdown id:91509ce5-ffe7-4937-aa87-6d7baac9ac04 tags:
## Evolving the grid
Now that we configured all the main parts of the population object, we can actually run the population! Doing this is straightforward: `population.evolve()`
This will start up the processing of all the systems. We can control how many cores are used by setting `num_cores`. By setting the `verbosity` of the population object to a higher value we can get a lot of verbose information about the run, but for now we set it to 0.
There are many grid_options that can lead to different behaviour of the evolution of the grid. Please do have a look at those: [grid options docs](https://ri0005.pages.surrey.ac.uk/binary_c-python/grid_options_descriptions.html), and try them out.
%% Cell type:code id:8ea376c1-1e92-45af-8cab-9d7fdca564eb tags:
``` python
# set number of threads
population.set(
    # verbose output is not required
    verbosity=0,
    # set number of threads (i.e. number of CPU cores we use)
    num_cores=2,
)
# Evolve the population - this is the slow, number-crunching step
analytics = population.evolve()
# Show the results (debugging)
# print (population.grid_results)
```
%% Output
adding: verbosity=0 to grid_options
Do dry run? True
Doing dry run to calculate total starcount and probability
Grid has handled 39 stars with a total probability of 1
**************************
* Dry run *
* Total starcount is 39 *
* Total probability is 1 *
**************************
Do join of subprocesses ...
Joined subprocesses.
**********************************************************
* Population-f9b28e4ed6ec4a67b17cd86c5a43c41c finished! *
* The total probability is 1. *
* It took a total of 3.99s to run 39 systems on 2 cores *
* = 7.98s of CPU time. *
* Maximum memory use 343.570 MB *
**********************************************************
No failed systems were found in this run.
Do analytics
Added analytics to metadata
%% Cell type:markdown id:91ab45c7-7d31-4543-aee4-127ab58e891f tags:
After the run is complete, a technical report on the run is returned; we stored it in `analytics`. As we can see below, this dictionary is like a status report of the evolution, useful for e.g. debugging.
%% Cell type:code id:e1f0464b-0424-4022-b34b-5b744bc2c59d tags:
``` python
print(analytics)
```
%% Output
{'population_id': 'f9b28e4ed6ec4a67b17cd86c5a43c41c', 'evolution_type': 'grid', 'failed_count': 0, 'failed_prob': 0, 'failed_systems_error_codes': [], 'errors_exceeded': False, 'errors_found': False, 'total_probability': 0.9999999999999999, 'total_count': 39, 'start_timestamp': 1655508316.7679594, 'end_timestamp': 1655508320.7581806, 'time_elapsed': 3.9902212619781494, 'total_mass_run': 1951.365, 'total_probability_weighted_mass_run': 50.035, 'zero_prob_stars_skipped': 0}
%% Cell type:code id:05c6d132-abee-423e-b1a8-2039c8996fbc tags:
``` python
# make a plot of the luminosity distribution using Seaborn and Pandas
import seaborn as sns
import pandas as pd
from binarycpython.utils.functions import pad_output_distribution
# set up seaborn for use in the notebook
sns.set(rc={'figure.figsize':(20,10)})
sns.set_context("notebook",
font_scale=1.5,
rc={"lines.linewidth":2.5})
# this saves a lot of typing!
ldist = population.grid_results['luminosity distribution']
# pad the distribution with zeros where data is missing
pad_output_distribution(ldist,
binwidth['luminosity'])
# make pandas dataframe from our sorted dictionary of data
plot_data = pd.DataFrame.from_dict({'ZAMS luminosity distribution' : ldist})
# make the plot
p = sns.lineplot(data=plot_data)
p.set_xlabel(r"$\log_{10}$ ($L_\mathrm{ZAMS}$ / L$_{☉}$)")
p.set_ylabel("Number of stars")
p.set(yscale="log")
```
%% Output
[None]
%% Cell type:markdown id:7d7b275e-be92-4d59-b44d-ef6f24023cc3 tags:
Does this look like a reasonable stellar luminosity function to you? The implication is that the most likely stellar luminosity is 10<sup>5.8</sup> L<sub>☉</sub>! Clearly, this is not very realistic... let's see what went wrong.
%% Cell type:markdown id:e32c3bbf-390f-45da-ad9c-cc3e7c9449dc tags:
## ZAMS Luminosity distribution with the initial mass function
In the previous example, all the stars in our grid had equal weighting. This is very unlikely to be true in reality: we know that low-mass stars are far more common than high-mass stars. So we now weight the stars by an initial mass function (IMF), using the three-part power law of Kroupa (2001): binarycpython has a function that does this for us (power laws are very common in astrophysics).
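For intuition, a continuous, normalised three-part power law can be sketched in a few lines. This is a standalone illustration mimicking the same argument order as the Kroupa-like call used in this notebook; it is not binarycpython's actual `three_part_powerlaw` implementation.

``` python
import math

def three_part_powerlaw(m, m0, m1, m2, m_max, a1, a2, a3):
    """Continuous, normalised three-part power law dprob/dM.
    Illustrative sketch only: binarycpython provides its own
    three_part_powerlaw; this mimics the same argument order."""
    # integral of m**a between lo and hi (per unit normalisation constant)
    def seg(lo, hi, a):
        return (hi ** (a + 1.0) - lo ** (a + 1.0)) / (a + 1.0)

    # continuity constants: fix k1 = 1 and match the segments at the breaks
    k1 = 1.0
    k2 = k1 * m1 ** (a1 - a2)
    k3 = k2 * m2 ** (a2 - a3)

    # normalise so the distribution integrates to 1 over [m0, m_max]
    norm = k1 * seg(m0, m1, a1) + k2 * seg(m1, m2, a2) + k3 * seg(m2, m_max, a3)

    if m < m1:
        k, a = k1, a1
    elif m < m2:
        k, a = k2, a2
    else:
        k, a = k3, a3
    return k * m ** a / norm

# Kroupa-like parameters, matching the probdist call in this notebook
print(three_part_powerlaw(0.5, 0.1, 0.5, 1.0, 150, -1.3, -2.3, -2.3))
```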
%% Cell type:code id:1f37d2c0-1108-4ab9-a309-20b1e6b6e3fd tags:
``` python
# Update the probability distribution to use the three-part power law IMF
population.update_grid_variable(
    name="M_1",
    probdist="self.three_part_powerlaw(M_1, 0.1, 0.5, 1.0, 150, -1.3, -2.3, -2.3)",
)
```
%% Cell type:code id:6f4463e8-1935-45f2-8c5f-e7b215f8dc47 tags:
``` python
# Clean and re-evolve the population
population.clean()
analytics = population.evolve()
# Show the results (debugging)
# print (population.grid_results)
```
%% Output
Do dry run? True
Doing dry run to calculate total starcount and probability
Grid has handled 39 stars with a total probability of 0.211729
**********************************
* Dry run *
* Total starcount is 39 *
* Total probability is 0.211729 *
**********************************
Do join of subprocesses ...
Joined subprocesses.
**********************************************************
* Population-1d1d556abeae4549aa28f9740807dc84 finished! *
* The total probability is 0.211729. *
* It took a total of 3.90s to run 39 systems on 2 cores *
* = 7.80s of CPU time. *
* Maximum memory use 519.211 MB *
**********************************************************
No failed systems were found in this run.
Do analytics
Added analytics to metadata
%% Cell type:code id:cfe45a9e-1121-43b6-b6b6-4de6f8946a18 tags:
``` python
# plot luminosity distribution
ldist = population.grid_results['luminosity distribution']
# pad the distribution with zeros where data is missing
pad_output_distribution(ldist,
binwidth['luminosity'])
# make pandas dataframe from our sorted dictionary of data
plot_data = pd.DataFrame.from_dict({'ZAMS luminosity distribution' : ldist})
# make the plot
p = sns.lineplot(data=plot_data)
p.set_xlabel(r"$\log_{10}$ ($L_\mathrm{ZAMS}$ / L$_{☉}$)")
p.set_ylabel("Number of stars")
p.set(yscale="log")
```
%% Output
[None]
%% Cell type:markdown id:0546f2f3-4732-4841-8ef3-565fbf6b9961 tags:
This distribution is peaked at low luminosity, as one expects from observations, but the resolution is clearly not great because it's not smooth - it's spiky!
If you noticed above, the total probability of the grid was about 0.2. Given that the total probability of a probability distribution function should be 1.0, this shows that our sampling is (very) poor.
We could simply increase the resolution to compensate, but this is very CPU intensive and a complete waste of time and resources. Instead, let's try sampling the masses of the stars in a smarter way.
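To see how badly a linear grid matches the IMF, here is a quick standalone count (illustrative, using the same grid bounds and resolution as above):

``` python
# Illustrative count (standalone): how many of the 40 linearly spaced
# grid masses lie below 1 Msun, where most of the IMF's probability sits?
m_min, m_max, n = 0.07, 100.0, 40
dm = (m_max - m_min) / n
masses = [m_min + i * dm for i in range(n)]
low_mass = [m for m in masses if m < 1.0]
print(len(low_mass), "of", n, "grid points below 1 Msun")   # -> 1 of 40
```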
%% Cell type:markdown id:673031c9-7d80-45d4-b209-301c127d3edf tags:
## A better-sampled grid
The IMF has many more low-mass stars than high-mass stars. So, instead of sampling M1 linearly, we can sample it in log space.
To do this we first rename the mass grid variable so that it is clear we are working in (natural) logarithmic phase space.
%% Cell type:code id:5956f746-e3b9-4912-b75f-8eb0af66d3f6 tags:
``` python
# Rename the old variable (M_1) because we want it to be called lnM_1 now
population.rename_grid_variable("M_1", "lnM_1")
```
%% Output
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Input In [27], in <cell line: 2>()
1 # Rename the old variable (M_1) because we want it to be called lnM_1 now
----> 2 population.rename_grid_variable("M_1","lnM_1")
File ~/.pyenv/versions/3.9.9/envs/dev_binarycpython3.9.9/lib/python3.9/site-packages/binarycpython/utils/population_extensions/gridcode.py:965, in gridcode.rename_grid_variable(self, oldname, newname)
963 else:
964 msg = "Failed to rename grid variable {} to {}.".format(oldname, newname)
--> 965 raise ValueError(msg)
ValueError: Failed to rename grid variable M_1 to lnM_1.
%% Cell type:markdown id:532f691c-c1f6-46cc-84f2-970ec1216e40 tags:
Next, we change the spacing function so that it works in the log space. We also adapt the probability calculation so that it calculates dprob/dlnM = M * dprob/dM. Finally, we set the precode to compute M_1 because binary_c requires the actual mass, not the logarithm of the mass.
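The change of variables can be checked numerically in isolation. In this illustrative sketch a single power-law IMF stands in for binary_c's Kroupa distribution: summing M × dprob/dM × dlnM over a grid uniform in ln M should recover a total probability close to 1.

``` python
import math

# Standalone check of dprob/dlnM = M * dprob/dM (illustrative: a single
# power-law IMF stands in for binary_c's Kroupa distribution).
m_min, m_max, n = 0.07, 100.0, 40
alpha = -2.35
norm = (m_max ** (alpha + 1) - m_min ** (alpha + 1)) / (alpha + 1)

def imf(m):
    """dprob/dM, normalised to integrate to 1 over [m_min, m_max]."""
    return m ** alpha / norm

dlnm = (math.log(m_max) - math.log(m_min)) / n
total = 0.0
for i in range(n):
    lnm = math.log(m_min) + (i + 0.5) * dlnm   # midpoint in ln M
    m = math.exp(lnm)                          # binary_c needs the actual mass
    total += imf(m) * m * dlnm                 # dprob/dlnM = M * dprob/dM

print(total)   # close to 1.0 even at this modest resolution
```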
%% Cell type:code id:108d470a-bb21-40b0-8387-2caa7ab0f923 tags:
``` python
# update the sampling, note that the IMF is dprob/dM1, and the phase
# space is now sampled in lnM1, so we multiply by M_1 to
# because M * dprob/dM = dprob/dlnM
population.update_grid_variable(
    name="lnM_1",
    samplerfunc="self.const_linear(math.log({min}), math.log({max}), {res})".format(
        min=massrange[0], max=massrange[1], res=resolution["M_1"]
    ),
    probdist="self.three_part_powerlaw(M_1, 0.1, 0.5, 1.0, 150, -1.3, -2.3, -2.3)*M_1",
    dphasevol="dlnM_1",
    parameter_name="M_1",
    precode="M_1=math.exp(lnM_1)",
)
# print(population.grid_options["_grid_variables"]) # debugging
```
%% Cell type:code id:fb8db646-f3d0-4ccd-81ba-7fde23f29c79 tags:
``` python
# Clean and re-evolve the population
population.clean()
analytics = population.evolve()
# Show the results (debugging)
# print (population.grid_results)
```
%% Output
Do dry run? True
Doing dry run to calculate total starcount and probability
Grid has handled 39 stars with a total probability of 0.991317
**********************************
* Dry run *
* Total starcount is 39 *
* Total probability is 0.991317 *
**********************************
Do join of subprocesses ...
Joined subprocesses.
**********************************************************
* Population-ec5d853278c743a3acdb8ab290f641df finished! *
* The total probability is 0.991317. *
* It took a total of 1.01s to run 39 systems on 2 cores *
* = 2.02s of CPU time. *
* Maximum memory use 475.789 MB *
**********************************************************
No failed systems were found in this run.
Do analytics
Added analytics to metadata
%% Cell type:markdown id:182b1094-5057-4ccf-bac6-9b0e560ad4f6 tags:
You should see that the total probability is very close to 1.0, as you would expect for a well-sampled grid. The total will never be exactly 1.0, but that is because we are running a simulation, not a perfect copy of reality.
%% Cell type:code id:68ee1e56-21e5-48f4-b74c-50e48685ae94 tags:
``` python
# plot luminosity distribution
ldist = population.grid_results['luminosity distribution']
# pad the distribution with zeros where data is missing
pad_output_distribution(ldist,
binwidth['luminosity'])
# make pandas dataframe from our sorted dictionary of data
plot_data = pd.DataFrame.from_dict({'ZAMS luminosity distribution' : ldist})
# make the plot
p = sns.lineplot(data=plot_data)
p.set_xlabel(r"$\log_{10}$ ($L_\mathrm{ZAMS}$ / L$_{☉}$)")
p.set_ylabel("Number of stars")
p.set(yscale="log")
plt.show()
```
%% Output
%% Cell type:markdown id:31fe91bb-177d-4e4e-90cf-298a3f8a8b61 tags:
Most stars are low mass red dwarfs, with small luminosities. Without the IMF weighting, our model population would have got this completely wrong!
As you increase the resolution, you will see this curve become even smoother. The wiggles in the curve are (usually) sampling artefacts, because the curve should brighten monotonically above about log(*L*/L<sub>☉</sub>) = -2.
Remember you can play with the binwidth too. If you want a very accurate distribution you need a narrow binwidth, but then you'll also need high resolution (lots of stars) so lots of CPU time, hence cost, CO<sub>2</sub>, etc.
%% Cell type:markdown id:ba032bd8-b4a2-4558-9fd9-8e1e03d7d162 tags:
Things to try:
* Change the resolution to make the distributions smoother: what about error bars, how would you do that?
* Different initial distributions: the Kroupa distribution isn't the only one out there
* Change the metallicity and mass ranges
* What about a non-constant star formation rate? This is more of a challenge!
* What about evolved stars? Here we consider only the *zero-age* main sequence. What about other main-sequence stars? What about stars in later phases of stellar evolution?
* Binary stars! (see notebook_luminosity_function_binaries.ipynb)
%% Cell type:code id:ef00d82c-dc36-4a94-b4be-98f58649644e tags:
``` python
```