The 10-step program
- Acknowledge your feelings
- Identify your skills and interests
- Decide what to do next
- Find a mentor
- Go public
- Form a study group
- Re-train and re-brand
- Settle in
- Mentor others
Let’s break these down.
I decided to spend two weeks hopping from continent to continent to take part in back-to-back astro-statistics-tech events: the COIN Residence Program and AstroHackWeek. A year after having left the field, formally speaking, I’ve chosen to make astronomy my hobby, taking “leave” to do research. It’s maybe not entirely sensible, but I’m doing this on my own terms. This blog is a report on things I learned during that sleep-deprived, mostly-barefoot fortnight.
First, a little background about the events.
The Cosmostatistics Initiative (COIN) is a collaboration that began in 2014 as a section of the International Astrostatistics Association (IAA) and brings together people across the Astronomer–Statistician spectrum to do some left-of-field research, introducing new data-analytic, statistical, and visualisation techniques to the astronomy community. The Residence Program happens once a year: we hang out in an apartment for a week, do some intense work on 2-3 projects well into the wee hours, write up half the papers, and still get some sun. This year we found ourselves in the lovely, warm city of Budapest.
AstroHackWeek (AHW), on the other hand, is a free-form event with elements of a workshop (pre-defined lectures) and a lot more making-it-up-as-we-go-along. Early on, 50 participants suggest topics they would like to learn about, identify one expert amongst the group and allow them to become teacher for an hour to a class of 10-20 (learning collectives are a brilliant idea!). Hack projects are the highlight, and are proposed both before and throughout the event; many of us will work on 2-4 at once. AHW also started in 2014, and was held this year at the Berkeley Institute for Data Science (BIDS).
For completeness, I’m also going to mention dotAstronomy, a similar out-of-the-box unconference that started way back in 2008/9. It has evolved over the years, but by the time I attended dotAstro7 in Sydney in 2015, it had become a combination of idea-lectures, just one day of hack-projects, and a lot of unconference group discussions. More of the emphasis is on software/tech and education/communication.
OK, so here’s my brain-dump:
Mixture models are the result of combining models for different sub-populations or classes. This makes them relevant both to clustering and classification routines and to dealing with outliers. You can never really tease the sub-populations apart; the point is to model the combined dataset, and maybe to provide a probability for each data-point that it belongs to a specific class.
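To make the "probability for each data-point" idea concrete, here is a minimal sketch of a two-component Gaussian mixture in one dimension. The weights, means and widths are made-up numbers (a narrow "inlier" population plus a broad "outlier" one), not values from any real analysis, and the function names are my own.

```python
import math

def normal_pdf(x, mean, sigma):
    """Density of a 1-D Gaussian with the given mean and standard deviation."""
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def membership_probabilities(x, weights, means, sigmas):
    """Posterior probability that x was drawn from each mixture component."""
    joint = [w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, sigmas)]
    total = sum(joint)  # this is the mixture density evaluated at x
    return [j / total for j in joint]

# Hypothetical mixture: 90% narrow inliers, 10% broad outliers.
weights = [0.9, 0.1]
means = [0.0, 0.0]
sigmas = [1.0, 10.0]

for x in (0.5, 3.0, 8.0):
    p_in, p_out = membership_probabilities(x, weights, means, sigmas)
    print(f"x = {x:4.1f}  P(inlier) = {p_in:.3f}  P(outlier) = {p_out:.3f}")
```

Note that no point is ever assigned to a class outright: a point near the centre is very probably an inlier, a point far out is very probably an outlier, and the ones in between stay ambiguous.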
Some parameters of the model will be relevant to different subsets of the group. For example, for supernova data one needs to model individual light-curves (layer 1), properties of supernovae type Ia (layer 2), and cosmology (layer 3). I’m now convinced that at least half of all models are actually hierarchical, just not recognised and named as such.
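A toy simulation helps me see the layering. The sketch below is a stripped-down, two-level version of the idea (a population, then individual objects, then noisy measurements of each object), with invented numbers and names. It is not the supernova model itself.

```python
import random
from statistics import mean

def simulate(n_objects, n_obs, pop_mean, pop_sigma, noise_sigma, seed=1):
    """Draw data from a simple hierarchy: population -> object -> observations."""
    rng = random.Random(seed)
    catalogue = []
    for _ in range(n_objects):
        # Upper layer: each object's true value is drawn from the population.
        true_value = rng.gauss(pop_mean, pop_sigma)
        # Lower layer: noisy measurements scatter around that object's true value.
        observations = [rng.gauss(true_value, noise_sigma) for _ in range(n_obs)]
        catalogue.append(observations)
    return catalogue

# Hypothetical settings, chosen only for illustration.
catalogue = simulate(n_objects=200, n_obs=20, pop_mean=3.0,
                     pop_sigma=0.5, noise_sigma=1.0)

# Parameters at different layers are informed by different subsets of the data:
object_means = [mean(obs) for obs in catalogue]  # one estimate per object
population_estimate = mean(object_means)         # one estimate for the whole group

print(f"first object mean:   {object_means[0]:.2f}")
print(f"population estimate: {population_estimate:.2f}")
```

Each per-object estimate uses only that object's measurements, while the population estimate pools everything, which is exactly the "relevant to different subsets" point.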
Probabilistic Graphical Models (PGMs) are diagrams that are very helpful for communicating parametrizations of models. You have to learn the “notation”, but once you do, they make great visual aids (see an example in this paper). Parameters are described as distributions, data or constants, and the relationships between them are drawn in. This is particularly good for describing hierarchical models.
Making your covariance matrix Gaussian is the first step to modelling correlated errors. This is a complicated subject, and Gaussian Processes (GPs) certainly have limitations (maybe Gaussian isn’t appropriate!), but it’s better than just a diagonal matrix, and besides, they have useful properties that make things easier to calculate.
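Here is a rough sketch of what "better than just a diagonal matrix" looks like in practice. I am assuming a squared-exponential kernel, which is a common default but by no means the only choice, and the observation times, error bars and kernel settings are all invented.

```python
import math

def squared_exponential(t1, t2, amplitude, length_scale):
    """Covariance between two points under a squared-exponential kernel."""
    return amplitude ** 2 * math.exp(-0.5 * ((t1 - t2) / length_scale) ** 2)

def covariance_matrix(times, sigmas, amplitude, length_scale):
    """Correlated kernel term everywhere, plus independent variances on the diagonal."""
    n = len(times)
    cov = [[squared_exponential(times[i], times[j], amplitude, length_scale)
            for j in range(n)] for i in range(n)]
    for i in range(n):
        cov[i][i] += sigmas[i] ** 2  # the usual uncorrelated error bars
    return cov

# Hypothetical observation times and per-point error bars.
times = [0.0, 1.0, 2.0, 5.0]
sigmas = [0.1, 0.1, 0.2, 0.1]
cov = covariance_matrix(times, sigmas, amplitude=1.0, length_scale=1.5)

for row in cov:
    print("  ".join(f"{value:6.3f}" for value in row))
```

The off-diagonal entries shrink as the points get further apart in time, so neighbouring measurements are treated as strongly correlated and distant ones as nearly independent. Setting the amplitude to zero recovers the plain diagonal matrix.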
This was the first time I actively used Jupyter Notebooks for writing Python code, and I was pleasantly surprised by the interactive features and formatted commenting. Perfect for small pieces of code and teaching/demonstration. However, I do have some questions/gripes (please let me know if there are solutions):
To be fair, I have an old version of IPython Notebook, so maybe these gripes no longer apply. I should talk to the Jupyter crew, one of whom I met at AHW.
Parallel programming in Python
I had thought that parallel programming wasn’t really possible in Python: you could run code on multiple threads, yes, but not really on multiple cores. People use multiprocessing sometimes, but now I need to look into mpipool. Could be useful, if you have the mpiexec job launcher set up on your cluster.
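For reference, the multiprocessing route looks roughly like this. It is a minimal sketch using only the standard library's `multiprocessing.Pool`. The `slow_square` function is a hypothetical stand-in for whatever expensive, independent calculation you actually want to farm out, and the choice of four worker processes is arbitrary.

```python
from multiprocessing import Pool

def slow_square(x):
    """Hypothetical stand-in for an expensive, independent calculation."""
    return x * x

if __name__ == "__main__":
    # Each worker is a separate process, so the work can spread across cores.
    with Pool(processes=4) as pool:
        results = pool.map(slow_square, range(10))
    print(results)
```

`pool.map` returns results in the same order as the inputs, so it can stand in for the built-in `map` when the tasks don't depend on each other. The `if __name__ == "__main__":` guard matters on platforms that start workers by re-importing the script.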
Natural Language Processing & Web-scraping
Despite being astronomers-by-trade, you’ll often find us talking excitedly about everything fascinating from outside our field. At a hack-week, we’re happy to give anything a shot. So after free dinner and drinks at GitHub HQ, we dreamt up the Happiness Hack (under a different name) and within 2 hours, created this.
It was going to end there, but the next day, we drummed up interest from the group and ended up extending the hack to grab** and analyse participants’ commit messages, as a bit of a joke, I guess, but here you go.
**beautiful-soup : holy crap!! So powerful, so beautiful…
Pair coding has been part of my life for the last few months, and I totally appreciate how it can really be more efficient despite the extra person investment. Just enough cooks. The small collaborations formed at both events worked wonderfully together, and several papers have been spawned. But really the big lesson, particularly from hacking at AHW, is that we benefit from learning to fail efficiently, because that sets us free to explore high risk projects. One person could hack away for weeks or months at an idea, while two or three people could declare it a lost cause in a mere day or two. Besides efficiency, this system prevents frustration and burn-out. Trying and failing was actively encouraged at AHW, and, better yet, demonstrated by senior participants.
Career transitions & Imposter Syndrome
Every time I meet with astronomers these days, the discussion turns to the process of leaving astronomy and imposter syndrome. The global community only really started talking about these on open forums about three years ago, and now it’s a recurring theme. At hack days/weeks, in particular, imposter syndrome is rife. Trying to prove your skills and worth and produce something spectacular on a short timescale is a recipe for mental health disaster. The pressure to dazzle with our hacking skillz certainly got to me back at dotAstro, but not as much this time, partly because the organisers made it a point to tackle the problem head-on (thank you!) and make the most of everyone’s diverse skill-sets, and partly because this time I knew better and put more emphasis on play and fun, and less on achieving goals.
So yeah, amongst the astronomy, statistics, computing, collaborating, hacking, and playing, I managed to learn a ton of stuff, see lovely places, and make new friends, which made the trip very worthwhile. My most important lesson, however, was:
Try not to doze off while on your laptop on the sofa near your colleagues, otherwise you end up with photos of creepy teddy bears watching you sleeping…
I had the privilege of attending the 2016 Australian Academy of Science Theo Murphy High Flyers Think Tank in Canberra just recently. I’d only heard about it via a single tweet the day before applications were due, but with the topic of “An interdisciplinary approach to living in a risky world”, my response was: yes please.
We were also asked to choose our preferred topic for breakout-group discussion, and I got my obvious favourite, the technical theme of “Uncertainty, ignorance and partial knowledge”, which turned out to have some focus on decision theory. The session would be chaired by Prof. Mark Colyvan, a professor of Philosophy at my alma mater, The University of Sydney, who had recently responded to Luke Barnes’s talk on the fine-tuning of the universe. Some of the recommended reading got me thinking about matters we didn’t get to cover (like how much I don’t like maximin), but I’ll discuss with Mark, and I’m sure I’ll blog about that later. In the meantime, our breakout group spent a couple of hours throwing around our thoughts and ideas and has begun to craft a report and recommendations for the Academy regarding decision-making and risk communication in the face of uncertainty.
My fellow delegates were such interesting people from diverse backgrounds like health, maths, stats, philosophy, history, law, geology, ecology, microbiology etc, and absorbing ideas from these amazing people over the two days provided a complete mental recharge. It was like NYSF for grown-ups. Even the conference dinner speech by emergency doctor David Caldicott was so stimulating, leaving me laughing and crying, that I’d dare say it was the “best event speech ever”.
Actually, one of the things I most enjoyed at the Think Tank was finding out people’s thoughts on rationality during tea break, as always. As it turns out, most people I spoke to (about this topic, sample size ~5) were adamant that people are, at heart, irrational creatures. Only one person (besides myself) thought otherwise. I’ve been told I have to read Daniel Kahneman’s Thinking, Fast and Slow to hear more arguments against the assumption of rationality. Apparently there are tests for this sort of thing…
I confess: I like Tom Stoppard because his plays highlight all the intellectually stimulating but somewhat pretentious (aren’t they all?) discussions I’ve had over the last 15 years. His latest, The Hard Problem, was no different. It follows Hilary, a psychology student who we meet as she applies for a job at the Krohl Institute for Brain Science, hoping to inject some humanity into their research. As always, Stoppard treats us to some witty banter, this time about altruism, animal behaviour, coincidence, consciousness, ego, evolutionary biology, morality, neuroscience, religion, and the worlds of academia and finance. The Hard Problem is perhaps less clever and fresh than Arcadia or RosenGuild, but fun and thought-provoking nonetheless. Some of the characters are true to the bone while others, disappointingly, feel typecast, but there is definitely some familiar truth in all. Overall, I’m pretty happy with the brain-lit Hytner production that we saw streamed live from the National Theatre in London – worth seeing.
Recently I attended the second ever Bayesian Young Statisticians’ Meeting (BAYSM’14) in Vienna, which was a really stimulating experience, and something pretty new for me, being my first non-astronomy conference. I won a prize for my talk too, which was pretty sweet!
During the two-day overview of theory and a variety of applications by the newest people in the field (read about the highlights over at the blogs of Ewan Cameron and Christian Robert), we heard from a few Keynote Speakers including Chris Holmes. In his talk, he mentioned the world of rational decision makers as envisioned by Leonard J. Savage in his 1954/1972 tome The Foundations of Statistics (adding that to my ‘to read’ list), and went on to describe the application of a loss function and minimax to avoid worst-case scenarios. Minimax isn’t the only approach to decision-making; I think other approaches are more relevant to our behaviour, as I’ll describe later.
“If you lived your life according to minimax, you’d never get out of bed” – C. Holmes
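The quote can be turned into a tiny worked example. In this sketch the actions, the states of the world, the loss values and the probabilities are all invented by me for illustration. The point is only to contrast minimax (pick the action whose worst case is least bad) with minimising expected loss.

```python
# Hypothetical loss table: loss[action][state]; bigger numbers are worse.
loss = {
    "stay in bed": {"ordinary day": 5, "disastrous day": 6},
    "get out of bed": {"ordinary day": 0, "disastrous day": 100},
}

# Hypothetical probabilities for each state of the world.
prob = {"ordinary day": 0.999, "disastrous day": 0.001}

def minimax_action(loss):
    """Choose the action whose worst-case loss is smallest."""
    return min(loss, key=lambda action: max(loss[action].values()))

def expected_loss_action(loss, prob):
    """Choose the action whose probability-weighted loss is smallest."""
    def expected(action):
        return sum(prob[state] * value for state, value in loss[action].items())
    return min(loss, key=expected)

print("minimax says:      ", minimax_action(loss))
print("expected loss says:", expected_loss_action(loss, prob))
```

With these numbers minimax keeps you under the covers, because it only looks at the worst possible day and ignores how unlikely it is. Weighting by probability sends you out the door.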
Children are very good at science. They start with broad priors (anything is possible) and learn through collecting data (see picture below) what conclusions are supported best by the evidence. They experiment, make mistakes, and test the variations on a theme. They learn what is dangerous; they learn what is tasty; they learn how to speak.
Our responses to experiences are very similar to Bayesian reasoning. Take trust as an example. If some dudette off the street – let’s call her Margaret – were to recommend a movie, say Moon, we might not heed her words since we have no reason to think we’d have the same taste in movies as her, but if upon watching Moon we found that we quite enjoyed it – we’d be more likely to rely on Margaret’s next tip, say Wadjda. And if Wadjda was also to our liking, we’d probably trust Margaret’s advice when she suggests Fast & Furious 6 (oops). But that blunder would reduce our confidence in her next recommendation, etc. If we define our experience of the movie in binary terms such as “liked” and “disliked”, the situation resembles the classic coin-toss experiment in which one tries to determine if a coin is biased by flipping it many times.
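Here is a minimal sketch of Margaret's track record treated as that coin-toss problem. I am assuming a uniform Beta(1, 1) prior on the probability that we like her picks, which is my choice for illustration. Each "liked" or "disliked" then updates the Beta distribution by adding one to the matching count.

```python
def trust_history(outcomes, alpha=1.0, beta=1.0):
    """Posterior mean of P(we like the next pick) after each recommendation.

    alpha and beta are the Beta prior parameters; (1, 1) is a flat prior.
    """
    history = []
    for liked in outcomes:
        if liked:
            alpha += 1  # one more success
        else:
            beta += 1   # one more failure
        history.append(alpha / (alpha + beta))
    return history

movies = [("Moon", True), ("Wadjda", True), ("Fast & Furious 6", False)]
trust = trust_history([liked for _, liked in movies])

for (title, liked), value in zip(movies, trust):
    verdict = "liked" if liked else "disliked"
    print(f"{title:18s} {verdict:9s} trust in next tip = {value:.2f}")
```

Starting from a trust of 0.5, two hits raise it to 0.67 and then 0.75, and the one blunder knocks it back to 0.60: lower, but still better than a stranger off the street.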