Drug consumption, collected online in 2011-2012
The database contains records for 1885 respondents. For each respondent, 12 attributes are known: personality measurements, which include NEO-FFI-R (neuroticism, extraversion, openness to experience, agreeableness, and conscientiousness), BIS-11 (impulsivity), and ImpSS (sensation seeking), as well as level of education, age, gender, country of residence, and ethnicity. In addition, participants were questioned concerning their use of 18 legal and illegal drugs (alcohol, amphetamines, amyl nitrite, benzodiazepines, cannabis, chocolate, cocaine, caffeine, crack, ecstasy, heroin, ketamine, legal highs, LSD, methadone, mushrooms, nicotine, and volatile substance abuse) and one fictitious drug (Semeron), which was introduced to identify over-claimers. For each drug, they selected one of the following: never used the drug, used it over a decade ago, or used it in the last decade, year, month, week, or day.
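The role of the fictitious drug can be illustrated with a small sketch. It assumes each respondent is held as a dictionary mapping a drug name to the selected answer, and that anyone reporting any use of Semeron is treated as an over-claimer. The record layout, the field names, and this filtering rule are illustrative assumptions, not part of the database specification.

```python
# Hypothetical record layout: one dict per respondent, drug name -> answer.
# The field names and the filtering rule below are assumptions for illustration.
NEVER_USED = "Never Used"


def is_over_claimer(record, fictitious_drug="Semeron"):
    """Return True if the respondent reports any use of the fictitious drug."""
    return record.get(fictitious_drug, NEVER_USED) != NEVER_USED


def drop_over_claimers(records):
    """Keep only respondents who did not claim to use the fictitious drug."""
    return [r for r in records if not is_over_claimer(r)]


# Toy data, not taken from the database.
respondents = [
    {"id": 1, "Cannabis": "Used in Last Week", "Semeron": "Never Used"},
    {"id": 2, "Cannabis": "Used in Last Year", "Semeron": "Used in Last Decade"},
    {"id": 3, "Cannabis": "Never Used", "Semeron": "Never Used"},
]

kept = drop_over_claimers(respondents)
kept_ids = [r["id"] for r in kept]
print(kept_ids)
```

Whether such respondents should be removed or only flagged depends on the analysis; the sketch shows the removal variant only.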
The database contains 18 classification problems. Each of the independent label variables contains seven classes: ‘Never Used’, ‘Used over a Decade Ago’, ‘Used in Last Decade’, ‘Used in Last Year’, ‘Used in Last Month’, ‘Used in Last Week’, and ‘Used in Last Day’.
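Because the seven classes are ordered by recency of use, they can be mapped to integer levels. The sketch below is one possible encoding: the integer codes (0 for the least recent, 6 for the most recent) and the function name `encode_usage` are assumptions made for illustration, and the distributed files may use a different coding.

```python
# The seven class labels, ordered from least to most recent use.
USAGE_LEVELS = [
    "Never Used",
    "Used over a Decade Ago",
    "Used in Last Decade",
    "Used in Last Year",
    "Used in Last Month",
    "Used in Last Week",
    "Used in Last Day",
]

# Assumed integer coding: position in the ordered list above.
LEVEL_INDEX = {label: i for i, label in enumerate(USAGE_LEVELS)}


def encode_usage(label):
    """Map a class label to its ordinal level; raises KeyError if unknown."""
    return LEVEL_INDEX[label]


print(encode_usage("Used in Last Year"))
```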
Two versions of the database are presented: the original database with nominal input features and a quantified database with numerical attributes.
Problems that can be solved:
• Seven-class classification for each drug separately.
• Problems can be transformed into binary classification by merging some of the classes into one new class. For example, ‘Never Used’ and ‘Used over a Decade Ago’ form the class ‘Non-user’, and all other classes form the class ‘User’.
• Finding the best binarization of classes for each attribute.
• Evaluation of the risk of being a consumer of a specific drug.
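The binarization described in the list above can be sketched as follows. The default cut point reproduces the example given there (‘Never Used’ and ‘Used over a Decade Ago’ become ‘Non-user’); the parameter `first_user_level` is a hypothetical name introduced here so that other cut points can be tried when looking for a better binarization.

```python
# The seven class labels, ordered from least to most recent use.
USAGE_LEVELS = [
    "Never Used",
    "Used over a Decade Ago",
    "Used in Last Decade",
    "Used in Last Year",
    "Used in Last Month",
    "Used in Last Week",
    "Used in Last Day",
]


def binarize(label, first_user_level="Used in Last Decade"):
    """Collapse a seven-class label into 'User' or 'Non-user'.

    Every level at or after `first_user_level` counts as 'User';
    every earlier level counts as 'Non-user'.
    """
    cut = USAGE_LEVELS.index(first_user_level)
    return "User" if USAGE_LEVELS.index(label) >= cut else "Non-user"


# Default cut point: the example from the list above.
default_labels = [binarize(level) for level in USAGE_LEVELS]

# Alternative cut point: only use within the last month counts as 'User'.
month_labels = [binarize(level, "Used in Last Month") for level in USAGE_LEVELS]

print(default_labels)
print(month_labels)
```

Moving the cut point changes the class balance of the resulting binary problem, which is one reason the choice may differ from drug to drug.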
The detailed description of the database is presented in:
1. Fehrman, E., Egan, V., Gorban, A.N., Levesley, J., Mirkes, E.M., Muhammad, A.K. Personality Traits and Drug Consumption: The Story Told by Data, https://www.springer.com/gp/book/9783030104412
2. Fehrman, E., Muhammad, A.K., Mirkes, E.M., Egan, V. and Gorban, A.N., 2017. The Five Factor Model of personality and evaluation of drug consumption risk. In Data Science (pp. 231-242). Springer, Cham.
3. Fehrman, E., Muhammad, A.K., Mirkes, E.M., Egan, V. and Gorban, A.N., 2015. The Five Factor Model of personality and evaluation of drug consumption risk. arXiv preprint arXiv:1506.06297.
Successful classifiers have been created for all drugs, making it possible to evaluate an individual's risk of drug consumption. For most drugs, sensitivity and specificity are greater than 75%.
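For reference, sensitivity and specificity of a binary ‘User’ / ‘Non-user’ classifier can be computed from true and predicted labels as in the sketch below. The function name and the toy labels are illustrative assumptions; the numbers are not results from the database.

```python
def sensitivity_specificity(y_true, y_pred, positive="User"):
    """Return (sensitivity, specificity) for binary labels.

    sensitivity = TP / (TP + FN): share of actual users that are detected.
    specificity = TN / (TN + FP): share of actual non-users that are detected.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Toy labels for illustration only.
y_true = ["User", "User", "User", "User",
          "Non-user", "Non-user", "Non-user", "Non-user"]
y_pred = ["User", "User", "User", "Non-user",
          "Non-user", "Non-user", "Non-user", "User"]

sens, spec = sensitivity_specificity(y_true, y_pred)
print(sens, spec)
```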