There are different anwsers depending on the more specific form of the question which is raised. Are you asking [A] where the formula of the skew-normal (SN) distribution appeared first, or [B] where the formulation of this extension of the normal class of distributions appeared first?

**Answer to question A**- The SN distribution can be generated via various forms of manipulation of
normal variates. As a consequence, it is not surprising that a number of
authors who were working on some problem involving normal variables ended up
with some expression equal or equivalent to the SN distribution, even if their
purpose was not really to build an extension of the normal class. The first
occurrence of this type appears to be:
`Birnbaum, Z. W. (1950). Effect of linear truncation on a multinormal population.`*Ann. Math. Statist.*21, 272-279.There are however many other occurrences; the list includes:

- a discussion with various contributors, summarized by Nelson (1964, Technometrics 6, 469-71);
- Roberts (1966, J. Am. Statist. Assoc. 61, 1184-90);
- O'Hagan & Leonard (1976, Biometrika, 63, 201-202);
- Aigner
*et al.*(1977, J. Econometrics, 12, 21-37).

**Answer to question B**- Althought the specific mathematical outcome is a somewhat different
type of distribution, the first idea of extending the normal class of
distributions in a constructive formulation via a population selection
mechanism appears to be the one of
whose importance and novelty is discussed in a paper by Azzalini & Regoli (2012).

As for the specific SN formulation, see Azzalini (1985, Scand. J. Statist 12, 171-178) for the scalar case, and Azzalini & Dalla Valle (1996, Biometrika, 83, 715-726) for the multivariate case.