Conjoint designs with restrictions in JavaScript
This post shows how you can create a choice-based conjoint survey experiment that includes restricted attribute combinations in JavaScript.
What are conjoint experiments?
Conjoint experiments are widely used in marketing and the social sciences to study people’s attitudes and decision-making. Their big advantage is that they allow analysts to study how people evaluate some object of interest and how these evaluations depend causally on specific attributes of that object. For example, a market researcher might be interested in how consumers evaluate a car based on its fuel efficiency, transmission, price tag, and its expected resale value, or a political scientist might want to see how voters evaluate presidential candidates based on their gender, level of education, profession, or age.
A typical conjoint experiment to study these questions, for example how consumers choose cars, would look as follows: The analyst would first define a number of attributes that they are interested in, as well as “levels” for each attribute:
- Price: 10'000,- USD; 20'000,- USD; 50'000,- USD; 80'000,- USD;
- Miles per gallon: 10; 15; 20; 25;
- Transmission: Manual; automatic;
- Resale value: 5'000,- USD; 15'000,- USD; 50'000,- USD;
Then they would create a survey in which respondents are presented with two fictional cars, each with a random combination of the attribute levels listed above, usually shown side by side in a table.
Respondents would then be asked to choose between the two cars and/or rate each on some scale (e.g., probability of purchase). An alternative would be to show respondents only one car to evaluate, or a short text (“vignette”) describing the car instead of a table.
It is the random combination of attributes that is crucial and makes conjoints so powerful: because the attribute levels are assigned at random, analysts can later use the data from many different choice tasks to estimate the unbiased causal effect of a given attribute level (e.g., a price tag of 50'000,-) on people’s decisions, independent of all other attributes.
In many cases, however, some attribute combinations are (or seem) implausible and should therefore be excluded. In the current example, this would be a car that has a price of 10'000 or 20'000 USD but an expected resale value of 50'000 USD.
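To make the restriction concrete, here is a minimal sketch in JavaScript (the language used in the rest of this post) of how the car example could be drawn so that the implausible price/resale combination never occurs. The names `carLevels`, `pickOne`, and `drawCar` are made up for this illustration, and the rule simply encodes the one restriction stated above.

```javascript
// Attribute levels from the car example (prices in USD)
const carLevels = {
  price: [10000, 20000, 50000, 80000],
  mpg: [10, 15, 20, 25],
  transmission: ["Manual", "Automatic"],
  resale: [5000, 15000, 50000]
};

// Draw one element uniformly at random (hypothetical helper)
function pickOne(levels) {
  return levels[Math.floor(Math.random() * levels.length)];
}

// Draw one car; the resale value is drawn conditional on the price
function drawCar() {
  const price = pickOne(carLevels.price);
  // Restriction: a 10'000 or 20'000 USD car cannot have a 50'000 USD resale value
  const allowedResale = price <= 20000
    ? carLevels.resale.filter(v => v !== 50000)
    : carLevels.resale;
  return {
    price: price,
    mpg: pickOne(carLevels.mpg),
    transmission: pickOne(carLevels.transmission),
    resale: pickOne(allowedResale)
  };
}

console.log(drawCar());
```

Drawing the restricted attribute conditional on the other one keeps the price levels equally likely; the cost is that a resale value of 50'000 USD shows up less often overall than the other two resale levels.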
Creating conjoint designs with constraints in JavaScript
My colleagues and I have been working quite a bit with conjoint designs (or their sibling, fractional factorial vignette designs) recently, for instance to study how people would allocate ICU beds to COVID-19 patients. But while we took charge of almost the entire design and analysis process ourselves, we relied on external providers and/or proprietary software for the implementation of the conjoint task in our surveys, which is not ideal. I therefore wanted to learn how to implement a conjoint experiment myself using open-source tools, ideally using JavaScript because this would allow me to directly embed the experiment within an online survey questionnaire.
One excellent set of instructions (on which I draw heavily here) got me part of the way, but I still needed to adjust its example code to be able to exclude certain combinations.
In what follows, I use my code to replicate the conjoint design from the Hidden American Consensus study on immigration policy attitudes by Hainmueller et al. using JavaScript. This code can obviously be adapted to generate other designs.
Step #1 is to define arrays containing all the different attribute levels:
// Arrays containing all attribute levels:
var EducationArray = ["High school","No formal","Graduate degree","4th grade","College degree","Two-year college","8th grade"];
var GenderArray = ["Male","Female"];
var OriginArray = ["Iraq","France","Sudan","Germany","Philippines","Poland","Mexico","Somalia","China","India"];
var Reason1Array = ["Seek better job","Escape persecution","Reunite with family"];
var Reason2Array = ["Seek better job","Reunite with family"];
var Job1Array = ["Nurse","Child care provider","Gardener","Construction worker","Teacher","Janitor","Waiter","Doctor","Financial analyst","Computer programmer","Research scientist"];
var Job2Array = ["Nurse","Child care provider","Gardener","Construction worker","Janitor","Waiter"];
var PlansArray = ["Contract with employer","Interviews with employer","Will look for work","No plans to look for work"];
var PriorArray = ["Once as tourist","Once w/o authorization","Never","Many times as tourist","Six months with family"];
var LanguageArray = ["Tried English, but unable","Used interpreter","Fluent English","Broken English"];
var ExpArray = ["None","1-2 years","3-5 years","5+ years"];
Note that there are two versions of the Reason and Job arrays. This is because these two attributes are subject to restrictions: The reason for immigration to the US can only be “Escape persecution” if the immigrant in question comes from Iraq, Sudan, or Somalia, and their job can only be doctor, teacher, financial analyst, computer programmer, or research scientist if they have at least some college education.
The next step is to create functions that shuffle (randomize) the attributes (this is taken from Matthew Graham’s instructions):
// Fisher-Yates shuffle (reorders the array in place and returns it):
function shuffle(array){
  for (var i = array.length - 1; i > 0; i--){
    var j = Math.floor(Math.random() * (i + 1));
    var temp = array[i];
    array[i] = array[j];
    array[j] = temp;
  }
  return array;
}
// Shuffle a vector, choose the first entry:
function shuffle_one(theArray){
  var out = shuffle(theArray);
  return out[0];
}
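Note that `shuffle()` reorders the array it is given in place, so every call to `shuffle_one()` also changes the order of the global level arrays. That is harmless here, because only a random first element is needed. If the original order matters, a single uniform draw gives the same result without touching the array. A minimal sketch (the name `pick_one` is made up for this illustration and is not part of the instructions this code builds on):

```javascript
// Draw one element uniformly at random without reordering the array
function pick_one(theArray){
  var idx = Math.floor(Math.random() * theArray.length);
  return theArray[idx];
}

var GenderArray = ["Male","Female"];
console.log(pick_one(GenderArray)); // "Male" or "Female"
console.log(GenderArray);           // order unchanged: ["Male","Female"]
```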
Now comes the important part: Defining a function that builds profile pairs, while excluding certain combinations. This is what the following code does:
// Builds a profile pair and accounts for implausible combinations
//////////////////////////////////////////////////////////////
function genprof(){
  // Profile "rump": profile number plus all unrestricted attributes
  var sw1 = [1,shuffle_one(EducationArray),shuffle_one(GenderArray),shuffle_one(OriginArray),shuffle_one(ExpArray),shuffle_one(PlansArray),shuffle_one(PriorArray),shuffle_one(LanguageArray)];
  var sw2 = [2,shuffle_one(EducationArray),shuffle_one(GenderArray),shuffle_one(OriginArray),shuffle_one(ExpArray),shuffle_one(PlansArray),shuffle_one(PriorArray),shuffle_one(LanguageArray)];
  // Exclusion of reason, conditional on origin
  if (sw1.includes("Iraq") || sw1.includes("Somalia") || sw1.includes("Sudan")) {
    sw1.push(shuffle_one(Reason1Array));
  } else {
    sw1.push(shuffle_one(Reason2Array));
  }
  if (sw2.includes("Iraq") || sw2.includes("Somalia") || sw2.includes("Sudan")) {
    sw2.push(shuffle_one(Reason1Array));
  } else {
    sw2.push(shuffle_one(Reason2Array));
  }
  // Exclusion of job, conditional on education
  if (sw1.includes("No formal") || sw1.includes("4th grade") || sw1.includes("8th grade") || sw1.includes("High school")) {
    sw1.push(shuffle_one(Job2Array));
  } else {
    sw1.push(shuffle_one(Job1Array));
  }
  if (sw2.includes("No formal") || sw2.includes("4th grade") || sw2.includes("8th grade") || sw2.includes("High school")) {
    sw2.push(shuffle_one(Job2Array));
  } else {
    sw2.push(shuffle_one(Job1Array));
  }
  // Building profiles
  var profiles = [];
  profiles.push(sw1);
  profiles.push(sw2);
  return profiles;
};
The first part creates a “rump” profile with all the unrestricted attributes. Following this come two sets of if-else conditions (one per profile in each set) that generate values for the Reason and Job attributes conditional on the values that were generated previously. The last part builds the profiles.
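Because the restrictions are easy to get subtly wrong (a level name spelled differently in the array and in the condition is enough), it can help to check generated profiles against the rules directly. The following is a minimal sketch, assuming the profile layout produced by `genprof()` above, that is, education at position 1, origin at position 3, reason at position 8, and job at position 9. The function name `violates_restrictions` and the three lookup arrays are made up for this illustration.

```javascript
// Levels that unlock the restricted options (taken from the rules described above)
var persecutionOrigins = ["Iraq","Sudan","Somalia"];
var highSkillJobs = ["Doctor","Teacher","Financial analyst","Computer programmer","Research scientist"];
var collegeEducation = ["Two-year college","College degree","Graduate degree"];

// Returns true if a profile contains an excluded combination
function violates_restrictions(profile){
  // String() also unwraps one-element arrays such as ["Doctor"]
  var education = String(profile[1]);
  var origin = String(profile[3]);
  var reason = String(profile[8]);
  var job = String(profile[9]);
  var badReason = reason === "Escape persecution" && !persecutionOrigins.includes(origin);
  var badJob = highSkillJobs.includes(job) && !collegeEducation.includes(education);
  return badReason || badJob;
}

// Two hand-written profiles for illustration
var ok  = [1,"College degree","Female","Iraq","None","Will look for work","Never","Fluent English","Escape persecution","Doctor"];
var bad = [2,"4th grade","Male","France","None","Will look for work","Never","Fluent English","Escape persecution","Doctor"];
console.log(violates_restrictions(ok));  // false
console.log(violates_restrictions(bad)); // true
```

Applied to the flattened deck that is built further below, `deck.filter(violates_restrictions).length` should return 0.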
I also wanted to make sure that the two profiles shown are never exactly identical (a remote possibility, but still…) so I created a recursive function to take care of that:
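// Checks that the two profiles in a pair are not identical
function checker(){
  var test = genprof();
  // Compare the attribute values only: position 0 holds the profile number (1 or 2),
  // and == applied to two arrays would compare object identity, not contents
  if (test[0].slice(1).join("|") === test[1].slice(1).join("|")){
    console.log("Identical profiles, doing over!");
    return checker();
  }
  return test;
};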
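One detail worth keeping in mind for checks like this: in JavaScript, `==` and `===` applied to two arrays compare object identity, not contents, so two separately generated profiles never compare as equal that way even when every attribute matches. A minimal sketch of a content-based comparison, assuming the profile layout from `genprof()` with the profile number in position 0 (the helper name `same_attributes` is made up for this illustration):

```javascript
// Compare two profiles by their attribute values, ignoring the profile number
function same_attributes(p1, p2){
  var a = p1.slice(1).map(String);
  var b = p2.slice(1).map(String);
  return a.length === b.length && a.every(function(v, i){ return v === b[i]; });
}

var p1 = [1,"College degree","Female","Poland"];
var p2 = [2,"College degree","Female","Poland"];
console.log(p1 == p2);                 // false: two distinct array objects
console.log(same_attributes(p1, p2));  // true: same attribute values
```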
Finally, the last part sets the number of choice tasks to be generated (I set this to around 13'000, which is similar to the Hainmueller et al. data) and then defines a loop that generates the profiles and combines them into a joint array. That array is then “flattened” for easier handling later.
// Number of choice tasks
var ncomps = 13080;
// Combine profiles into the "deck" seen by an individual respondent (necessary when using this code within an online survey)
var deck = []; // captures profile deck
for (let i = 0; i < ncomps; i++){ // runs exactly ncomps times
  var x = checker();
  deck.push(x);
};
// Flatten array to "table": one profile per row
deck = deck.flat();
This code so far can be used to generate and populate a conjoint design within an online questionnaire on the fly (the number of choice tasks would then of course be reduced to something more manageable for a single respondent, say 5).
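For on-the-fly use, each profile pair then has to be turned into something respondents can read. How this is wired into a particular survey platform depends on that platform, so the following is only a sketch of the rendering step: it builds an HTML table string from one pair as returned by `genprof()`. The function name `pair_to_html`, the label array, and the column headings are made up for this illustration; the labels follow the order in which `genprof()` fills the profile.

```javascript
// Attribute labels in the order used by genprof() (position 0 is the profile number)
var labels = ["Education","Gender","Origin","Experience","Plans","Prior","Language","Reason","Job"];

// Build an HTML table with one row per attribute and one column per profile
function pair_to_html(pair){
  var rows = labels.map(function(label, i){
    return "<tr><th>" + label + "</th><td>" + String(pair[0][i + 1]) +
           "</td><td>" + String(pair[1][i + 1]) + "</td></tr>";
  });
  return "<table><tr><th></th><th>Profile 1</th><th>Profile 2</th></tr>" +
         rows.join("") + "</table>";
}

// Hand-written pair for illustration
var pair = [
  [1,"College degree","Female","Poland","1-2 years","Will look for work","Never","Fluent English","Seek better job","Nurse"],
  [2,"High school","Male","Iraq","None","Contract with employer","Never","Broken English","Escape persecution","Janitor"]
];
console.log(pair_to_html(pair));
```

In a browser, the resulting string could then be assigned to the `innerHTML` of a container element in the questionnaire page.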
In this case, I wanted to create an entire “database” of profiles to see if the code I wrote successfully generates a similar distribution of attributes as the Hainmueller et al. design. (Exporting a design is sometimes also necessary when these are to be shared with survey platform providers.)
The following code (based on this post on stackoverflow) does that:
// EXPORT TO CSV
////////////////
// adds dim names
deck.unshift(["ProfileNo","Education","Gender","Origin","Experience","Plans","Prior","Language","Reason","Job"]);
// Convert to CSV string
function arrayToCsv(data){
  return data.map(row =>
    row
      .map(String) // convert every value to String
      .map(v => v.replaceAll('"', '""')) // escape double quotes
      .map(v => `"${v}"`) // quote it
      .join(',') // comma-separated
  ).join('\r\n'); // rows starting on new lines
};
file = arrayToCsv(deck);
// Download contents as a file
function downloadBlob(content, filename, contentType) {
  // Create a blob
  var blob = new Blob([content], { type: contentType });
  var url = URL.createObjectURL(blob);
  // Create a link to download it
  var pom = document.createElement('a');
  pom.href = url;
  pom.setAttribute('download', filename);
  pom.click();
};
downloadBlob(file, 'cjoint_profiles.csv', 'text/csv;charset=utf-8;');
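To see what the quoting and escaping steps in `arrayToCsv()` do, it helps to run the same transformation on a small hand-written table. The sketch below repeats the function so that it runs on its own; it uses `replace()` with a global regular expression instead of `replaceAll()`, which is equivalent here and also works in older JavaScript engines. The sample rows are made up.

```javascript
function arrayToCsv(data){
  return data.map(row =>
    row
      .map(String)                      // convert every value to a string
      .map(v => v.replace(/"/g, '""'))  // escape double quotes by doubling them
      .map(v => `"${v}"`)               // wrap every value in quotes
      .join(',')                        // comma-separated
  ).join('\r\n');                       // one row per line
}

var sample = [
  ["ProfileNo","Prior","Note"],
  [1,"Once w/o authorization",'said "hello", then left']
];
console.log(arrayToCsv(sample));
// "ProfileNo","Prior","Note"
// "1","Once w/o authorization","said ""hello"", then left"
```

Wrapping every value in quotes is what keeps the comma inside the last field from being read as a column separator.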
The entire JavaScript code file (here called cjoint_gen.js) can now be embedded into a simple HTML file that can be run in any browser to automatically generate and export the profiles as a CSV file:
<!DOCTYPE html>
<html>
<head><title>Generate conjoint profiles</title></head>
<body>
<div id="test">Your profiles have been created; please check your Downloads folder for a CSV file called "cjoint_profiles.csv".</div>
<script src="cjoint_gen.js"></script>
</body>
</html>
Does it work?
I generated a batch of profiles and then used R to see if the design works as it should. Reading in the data is obviously the first step:
library(tidyverse)
profiles <- read.csv("cjoint_profiles.csv")
profiles %>%
mutate(across(Education:Language, as.factor)) -> profiles
Based on the design choices (and previous results), there should be fewer profiles that had “Escape persecution” as their reason for migration and fewer profiles with high-skilled jobs — because these attributes could only appear if certain other attributes were present. To see if that is also the case here, I created a simple (and not very polished) graph that shows the distribution of each attribute in the entire sample of profiles:
as.data.frame(unlist(apply(profiles, 2, function(x){
prop.table(table(x))*100
}))) %>%
rename(perc = starts_with("unlist")) %>%
rownames_to_column(var = "attribute") %>%
ggplot(aes(y = attribute, x = perc)) +
geom_bar(stat = "identity") +
labs(x = "Percent", y = "") +
theme_classic()
The result looks like this:
There are indeed fewer profiles that seek to escape persecution or who have high-skilled professions.
A second important thing to check is of course whether the randomization worked, that is, whether the attributes are really independent of each other (except for those that are restricted). To check this, I run a chi-squared test for each attribute pair and visualize the resulting p-values:
profiles %>%
  select(-ProfileNo) -> profiles2
# Chi-squared test for every pair of attributes; collect the p-values in a matrix
as.data.frame(apply(profiles2, 2, function(x){
  apply(profiles2, 2, function(y){
    res <- chisq.test(table(y, x))
    return(res$p.value)
  })
})) %>%
  rownames_to_column(var = "dimension1") %>%
  pivot_longer(cols = -starts_with("dimension"),
               values_to = "pval",
               names_to = "dimension2") %>%
  as.data.frame() %>%
  ggplot(aes(x = dimension1, y = dimension2, fill = pval)) +
  geom_tile() +
  geom_text(aes(label = format.pval(pval, eps = 0.01, digits = 2))) +
  scale_fill_gradient(low = "red", high = "white") +
  labs(x = "", y = "",
       fill = "p-value") +
  theme_classic() +
  theme(legend.position = "bottom")
The result is reassuring: With few exceptions, only those attribute pairs that have excluded combinations are associated with each other. Reason and Language come close to a significant association, however, and there is a significant association between Plans and Language.
That there are unexpected associations between attributes that should be uncorrelated is not a large problem. For one, these might simply be false positives (to be expected when running 9x9=81 statistical tests). For another, the original Hainmueller et al. data (retrieved from the AJPS dataverse) also contain some associations (which might also be false positives):
hain <- labelled::unlabelled(haven::read_dta("immigrant.dta"))
hain %>%
  select(starts_with("Feat")) -> profiles
# Chi-squared test for every pair of attributes; collect the p-values in a matrix
as.data.frame(apply(profiles, 2, function(x){
  apply(profiles, 2, function(y){
    res <- chisq.test(table(y, x))
    return(res$p.value)
  })
})) %>%
  rownames_to_column(var = "dimension1") %>%
  pivot_longer(cols = -starts_with("dimension"),
               values_to = "pval",
               names_to = "dimension2") %>%
  as.data.frame() %>%
  ggplot(aes(x = dimension1, y = dimension2, fill = pval)) +
  geom_tile() +
  geom_text(aes(label = format.pval(pval, eps = 0.01, digits = 2))) +
  scale_fill_gradient(low = "red", high = "white") +
  labs(x = "", y = "",
       fill = "p-value") +
  theme_classic() +
  theme(legend.position = "bottom")
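To put the false-positive argument made above in rough numbers: a 9x9 grid has 81 cells, but 9 of them compare an attribute with itself and the rest show every pair twice, which leaves 36 distinct attribute pairs. The sketch below computes how many nominally significant results pure chance would be expected to produce among those pairs. It assumes a significance level of 0.05 and, as a simplification, treats the tests as independent; the variable names are made up for this illustration.

```javascript
var nAttributes = 9;
var alpha = 0.05; // assumed significance level

// Number of distinct attribute pairs: n * (n - 1) / 2
var nPairs = nAttributes * (nAttributes - 1) / 2;

// Expected number of false positives if all pairs were truly independent
var expectedFalsePositives = nPairs * alpha;

// Probability of at least one false positive among the pairs
var probAtLeastOne = 1 - Math.pow(1 - alpha, nPairs);

// Bonferroni-corrected threshold for a single test
var bonferroni = alpha / nPairs;

console.log(nPairs);                            // 36
console.log(expectedFalsePositives.toFixed(1)); // "1.8"
console.log(probAtLeastOne.toFixed(2));         // "0.84"
console.log(bonferroni.toFixed(4));             // "0.0014"
```

Two of the 36 pairs (Reason with Origin, Job with Education) are associated by design, so the calculation slightly overstates the number of pairs to which it applies. Even so, one or two unexpected “significant” pairs at the 0.05 level are roughly what chance alone would produce.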