Analyze usage trends for one of Bellabeat’s products, together with smart device data, to gain actionable insights that can guide data-driven marketing strategies and unlock Bellabeat’s growth potential.
Urška Sršen - Bellabeat’s Cofounder & CCO
Sando Mur - Mathematician and Bellabeat’s cofounder; key member of the Bellabeat executive team
Bellabeat marketing analytics team - A team of data analysts responsible for collecting, analyzing, and reporting data that helps guide Bellabeat’s marketing strategy.
What are some trends in smart device usage?
How could these trends apply to Bellabeat customers?
How could these trends help influence Bellabeat marketing strategy?
install.packages("tidyverse", quiet = T)
## package 'tidyverse' successfully unpacked and MD5 sums checked
install.packages("skimr", quiet = T)
## package 'skimr' successfully unpacked and MD5 sums checked
install.packages("janitor", quiet = T)
## package 'janitor' successfully unpacked and MD5 sums checked
install.packages("dplyr", quiet = T)
## package 'dplyr' successfully unpacked and MD5 sums checked
install.packages("sqldf", quiet = T)
## package 'sqldf' successfully unpacked and MD5 sums checked
install.packages("reshape2", quiet = T)
## package 'reshape2' successfully unpacked and MD5 sums checked
library("tidyverse", quietly = T)
library("skimr", quietly = T)
library("janitor", quietly = T)
library("dplyr", quietly = T)
library("sqldf", quietly = T)
library("reshape2", quietly = T)
setwd("/Users/Jeong Park/OneDrive/Documents/Fitabase Data 2016.04.12 - 2016.05.12")
daily_Activity <- read_csv("dailyActivity_merged.csv", show_col_types = FALSE)
daily_Calories <- read_csv("dailyCalories_merged.csv", show_col_types = FALSE)
daily_Intensities <- read_csv("dailyIntensities_merged.csv", show_col_types = FALSE)
daily_Steps <- read_csv("dailySteps_merged.csv", show_col_types = FALSE)
sleep_Day <- read_csv("sleepDay_merged.csv", show_col_types = FALSE)
weight_Loginfo <- read_csv("weightLogInfo_merged.csv", show_col_types = FALSE)
head(daily_Activity)
## # A tibble: 6 x 15
## Id ActivityDate TotalSteps TotalDistance TrackerDistance LoggedActivitie~
## <dbl> <chr> <dbl> <dbl> <dbl> <dbl>
## 1 1.50e9 4/12/2016 13162 8.5 8.5 0
## 2 1.50e9 4/13/2016 10735 6.97 6.97 0
## 3 1.50e9 4/14/2016 10460 6.74 6.74 0
## 4 1.50e9 4/15/2016 9762 6.28 6.28 0
## 5 1.50e9 4/16/2016 12669 8.16 8.16 0
## 6 1.50e9 4/17/2016 9705 6.48 6.48 0
## # ... with 9 more variables: VeryActiveDistance <dbl>,
## # ModeratelyActiveDistance <dbl>, LightActiveDistance <dbl>,
## # SedentaryActiveDistance <dbl>, VeryActiveMinutes <dbl>,
## # FairlyActiveMinutes <dbl>, LightlyActiveMinutes <dbl>,
## # SedentaryMinutes <dbl>, Calories <dbl>
glimpse(daily_Activity)
## Rows: 940
## Columns: 15
## $ Id <dbl> 1503960366, 1503960366, 1503960366, 150396036~
## $ ActivityDate <chr> "4/12/2016", "4/13/2016", "4/14/2016", "4/15/~
## $ TotalSteps <dbl> 13162, 10735, 10460, 9762, 12669, 9705, 13019~
## $ TotalDistance <dbl> 8.50, 6.97, 6.74, 6.28, 8.16, 6.48, 8.59, 9.8~
## $ TrackerDistance <dbl> 8.50, 6.97, 6.74, 6.28, 8.16, 6.48, 8.59, 9.8~
## $ LoggedActivitiesDistance <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ~
## $ VeryActiveDistance <dbl> 1.88, 1.57, 2.44, 2.14, 2.71, 3.19, 3.25, 3.5~
## $ ModeratelyActiveDistance <dbl> 0.55, 0.69, 0.40, 1.26, 0.41, 0.78, 0.64, 1.3~
## $ LightActiveDistance <dbl> 6.06, 4.71, 3.91, 2.83, 5.04, 2.51, 4.71, 5.0~
## $ SedentaryActiveDistance <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ~
## $ VeryActiveMinutes <dbl> 25, 21, 30, 29, 36, 38, 42, 50, 28, 19, 66, 4~
## $ FairlyActiveMinutes <dbl> 13, 19, 11, 34, 10, 20, 16, 31, 12, 8, 27, 21~
## $ LightlyActiveMinutes <dbl> 328, 217, 181, 209, 221, 164, 233, 264, 205, ~
## $ SedentaryMinutes <dbl> 728, 776, 1218, 726, 773, 539, 1149, 775, 818~
## $ Calories <dbl> 1985, 1797, 1776, 1745, 1863, 1728, 1921, 203~
head(daily_Calories)
## # A tibble: 6 x 3
## Id ActivityDay Calories
## <dbl> <chr> <dbl>
## 1 1503960366 4/12/2016 1985
## 2 1503960366 4/13/2016 1797
## 3 1503960366 4/14/2016 1776
## 4 1503960366 4/15/2016 1745
## 5 1503960366 4/16/2016 1863
## 6 1503960366 4/17/2016 1728
glimpse(daily_Calories)
## Rows: 940
## Columns: 3
## $ Id <dbl> 1503960366, 1503960366, 1503960366, 1503960366, 1503960366~
## $ ActivityDay <chr> "4/12/2016", "4/13/2016", "4/14/2016", "4/15/2016", "4/16/~
## $ Calories <dbl> 1985, 1797, 1776, 1745, 1863, 1728, 1921, 2035, 1786, 1775~
head(daily_Intensities)
## # A tibble: 6 x 10
## Id ActivityDay SedentaryMinutes LightlyActiveMinutes FairlyActiveMinu~
## <dbl> <chr> <dbl> <dbl> <dbl>
## 1 1503960366 4/12/2016 728 328 13
## 2 1503960366 4/13/2016 776 217 19
## 3 1503960366 4/14/2016 1218 181 11
## 4 1503960366 4/15/2016 726 209 34
## 5 1503960366 4/16/2016 773 221 10
## 6 1503960366 4/17/2016 539 164 20
## # ... with 5 more variables: VeryActiveMinutes <dbl>,
## # SedentaryActiveDistance <dbl>, LightActiveDistance <dbl>,
## # ModeratelyActiveDistance <dbl>, VeryActiveDistance <dbl>
glimpse(daily_Intensities)
## Rows: 940
## Columns: 10
## $ Id <dbl> 1503960366, 1503960366, 1503960366, 150396036~
## $ ActivityDay <chr> "4/12/2016", "4/13/2016", "4/14/2016", "4/15/~
## $ SedentaryMinutes <dbl> 728, 776, 1218, 726, 773, 539, 1149, 775, 818~
## $ LightlyActiveMinutes <dbl> 328, 217, 181, 209, 221, 164, 233, 264, 205, ~
## $ FairlyActiveMinutes <dbl> 13, 19, 11, 34, 10, 20, 16, 31, 12, 8, 27, 21~
## $ VeryActiveMinutes <dbl> 25, 21, 30, 29, 36, 38, 42, 50, 28, 19, 66, 4~
## $ SedentaryActiveDistance <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ~
## $ LightActiveDistance <dbl> 6.06, 4.71, 3.91, 2.83, 5.04, 2.51, 4.71, 5.0~
## $ ModeratelyActiveDistance <dbl> 0.55, 0.69, 0.40, 1.26, 0.41, 0.78, 0.64, 1.3~
## $ VeryActiveDistance <dbl> 1.88, 1.57, 2.44, 2.14, 2.71, 3.19, 3.25, 3.5~
head(daily_Steps)
## # A tibble: 6 x 3
## Id ActivityDay StepTotal
## <dbl> <chr> <dbl>
## 1 1503960366 4/12/2016 13162
## 2 1503960366 4/13/2016 10735
## 3 1503960366 4/14/2016 10460
## 4 1503960366 4/15/2016 9762
## 5 1503960366 4/16/2016 12669
## 6 1503960366 4/17/2016 9705
glimpse(daily_Steps)
## Rows: 940
## Columns: 3
## $ Id <dbl> 1503960366, 1503960366, 1503960366, 1503960366, 1503960366~
## $ ActivityDay <chr> "4/12/2016", "4/13/2016", "4/14/2016", "4/15/2016", "4/16/~
## $ StepTotal <dbl> 13162, 10735, 10460, 9762, 12669, 9705, 13019, 15506, 1054~
head(sleep_Day)
## # A tibble: 6 x 5
## Id SleepDay TotalSleepRecords TotalMinutesAsleep TotalTimeInBed
## <dbl> <chr> <dbl> <dbl> <dbl>
## 1 1503960366 4/12/2016 1 327 346
## 2 1503960366 4/13/2016 2 384 407
## 3 1503960366 4/15/2016 1 412 442
## 4 1503960366 4/16/2016 2 340 367
## 5 1503960366 4/17/2016 1 700 712
## 6 1503960366 4/19/2016 1 304 320
glimpse(sleep_Day)
## Rows: 413
## Columns: 5
## $ Id <dbl> 1503960366, 1503960366, 1503960366, 1503960366, 150~
## $ SleepDay <chr> "4/12/2016", "4/13/2016", "4/15/2016", "4/16/2016",~
## $ TotalSleepRecords <dbl> 1, 2, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ~
## $ TotalMinutesAsleep <dbl> 327, 384, 412, 340, 700, 304, 360, 325, 361, 430, 2~
## $ TotalTimeInBed <dbl> 346, 407, 442, 367, 712, 320, 377, 364, 384, 449, 3~
head(weight_Loginfo)
## # A tibble: 6 x 8
## Id Date WeightKg WeightPounds Fat BMI IsManualReport LogId
## <dbl> <chr> <dbl> <dbl> <dbl> <dbl> <lgl> <dbl>
## 1 1503960366 5/2/2016 52.6 116. 22 22.6 TRUE 1.46e12
## 2 1503960366 5/3/2016 52.6 116. NA 22.6 TRUE 1.46e12
## 3 1927972279 4/13/2016 134. 294. NA 47.5 FALSE 1.46e12
## 4 2873212765 4/21/2016 56.7 125. NA 21.5 TRUE 1.46e12
## 5 2873212765 5/12/2016 57.3 126. NA 21.7 TRUE 1.46e12
## 6 4319703577 4/17/2016 72.4 160. 25 27.5 TRUE 1.46e12
glimpse(weight_Loginfo)
## Rows: 67
## Columns: 8
## $ Id <dbl> 1503960366, 1503960366, 1927972279, 2873212765, 2873212~
## $ Date <chr> "5/2/2016", "5/3/2016", "4/13/2016", "4/21/2016", "5/12~
## $ WeightKg <dbl> 52.6, 52.6, 133.5, 56.7, 57.3, 72.4, 72.3, 69.7, 70.3, ~
## $ WeightPounds <dbl> 115.9631, 115.9631, 294.3171, 125.0021, 126.3249, 159.6~
## $ Fat <dbl> 22, NA, NA, NA, NA, 25, NA, NA, NA, NA, NA, NA, NA, NA,~
## $ BMI <dbl> 22.65, 22.65, 47.54, 21.45, 21.69, 27.45, 27.38, 27.25,~
## $ IsManualReport <lgl> TRUE, TRUE, FALSE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, ~
## $ LogId <dbl> 1.46223e+12, 1.46232e+12, 1.46051e+12, 1.46128e+12, 1.4~
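The date columns above are read as character strings (`<chr>`), so they sort alphabetically rather than chronologically. A minimal sketch of converting them with base R's `as.Date()`, assuming the month/day/year layout shown in the output; the `example_dates` vector is made up for illustration:

```r
# Toy values in the same month/day/year layout as the ActivityDate column
example_dates <- c("4/12/2016", "5/2/2016", "4/9/2016")

# As character strings these sort alphabetically, not chronologically
sort(example_dates)

# Parse into Date objects so sorting and date arithmetic behave as expected
parsed_dates <- as.Date(example_dates, format = "%m/%d/%Y")
sort(parsed_dates)

# Day of the week for each date (names depend on the session locale)
weekdays(parsed_dates)
```

The same call could be applied to a column inside `mutate()`, for example on `Date` after the renaming step.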
daily_Activity <- daily_Activity %>%
rename(Date = ActivityDate)
daily_Steps <- daily_Steps %>%
rename(TotalSteps = StepTotal)
sleep_Day <- sleep_Day %>% rename(Date = SleepDay)
# Give every daily table the same date column name so records can be matched by day.
daily_Calories <- daily_Calories %>% rename(Date = ActivityDay)
daily_Intensities <- daily_Intensities %>% rename(Date = ActivityDay)
daily_Steps <- daily_Steps %>% rename(Date = ActivityDay)
# Join on Id, Date, and every other shared column; without Date, different days with equal values would be paired with each other.
join_1 <- inner_join(daily_Activity, daily_Calories, by = intersect(names(daily_Activity), names(daily_Calories)))
join_2 <- inner_join(daily_Activity, daily_Intensities, by = intersect(names(daily_Activity), names(daily_Intensities)))
join_3 <- inner_join(daily_Activity, daily_Steps, by = intersect(names(daily_Activity), names(daily_Steps)))
join_4 <- inner_join(daily_Activity, sleep_Day)
## Joining with `by = join_by(Id, Date)`
## Warning in inner_join(daily_Activity, sleep_Day): Each row in `x` is expected to match at most 1 row in `y`.
## i Row 436 of `x` matches multiple rows.
## i If multiple matches are expected, set `multiple = "all"` to silence this
## warning.
merge_A <- merge(join_1, join_2)
merge_B <- merge(join_3, join_4)
daily_userData <- merge(merge_A, merge_B)
head(daily_userData)
## Id Date TotalSteps TotalDistance TrackerDistance
## 1 1503960366 4/12/2016 13162 8.50 8.50
## 2 1503960366 4/13/2016 10735 6.97 6.97
## 3 1503960366 4/15/2016 9762 6.28 6.28
## 4 1503960366 4/16/2016 12669 8.16 8.16
## 5 1503960366 4/17/2016 9705 6.48 6.48
## 6 1503960366 4/19/2016 15506 9.88 9.88
## LoggedActivitiesDistance VeryActiveDistance ModeratelyActiveDistance
## 1 0 1.88 0.55
## 2 0 1.57 0.69
## 3 0 2.14 1.26
## 4 0 2.71 0.41
## 5 0 3.19 0.78
## 6 0 3.53 1.32
## LightActiveDistance SedentaryActiveDistance VeryActiveMinutes
## 1 6.06 0 25
## 2 4.71 0 21
## 3 2.83 0 29
## 4 5.04 0 36
## 5 2.51 0 38
## 6 5.03 0 50
## FairlyActiveMinutes LightlyActiveMinutes SedentaryMinutes Calories
## 1 13 328 728 1985
## 2 19 217 776 1797
## 3 34 209 726 1745
## 4 10 221 773 1863
## 5 20 164 539 1728
## 6 31 264 775 2035
## TotalSleepRecords TotalMinutesAsleep TotalTimeInBed
## 1 1 327 346
## 2 2 384 407
## 3 1 412 442
## 4 2 340 367
## 5 1 700 712
## 6 1 304 320
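When tables are joined on a value column such as `Calories` without the date, two days that happen to share the same value are paired with each other, which inflates the row count. A minimal sketch with made-up toy data (the `activity` and `calories` data frames are hypothetical, not the Fitbit files):

```r
library(dplyr)

# Toy data (made up): one user, two days with the same calorie total
activity <- data.frame(
  Id = c(1, 1),
  Date = c("4/12/2016", "4/13/2016"),
  Calories = c(2000, 2000)
)
calories <- data.frame(
  Id = c(1, 1),
  ActivityDay = c("4/12/2016", "4/13/2016"),
  Calories = c(2000, 2000)
)

# Keyed on Id and Calories only: every row matches both days (2 x 2 = 4 rows).
# Recent dplyr versions warn about this, so the warning is silenced here.
loose <- suppressWarnings(
  inner_join(activity, calories, by = c("Id", "Calories"))
)
nrow(loose)

# Keyed on Id, the date, and Calories: one row per day
strict <- inner_join(
  activity, calories,
  by = c("Id", "Date" = "ActivityDay", "Calories")
)
nrow(strict)
```

Comparing `nrow()` before and after a join is a quick way to spot this kind of unintended duplication.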
cbind(lapply(lapply(daily_userData, is.na),sum))
## [,1]
## Id 0
## Date 0
## TotalSteps 0
## TotalDistance 0
## TrackerDistance 0
## LoggedActivitiesDistance 0
## VeryActiveDistance 0
## ModeratelyActiveDistance 0
## LightActiveDistance 0
## SedentaryActiveDistance 0
## VeryActiveMinutes 0
## FairlyActiveMinutes 0
## LightlyActiveMinutes 0
## SedentaryMinutes 0
## Calories 0
## TotalSleepRecords 0
## TotalMinutesAsleep 0
## TotalTimeInBed 0
sum(is.na(weight_Loginfo))
## [1] 65
n_distinct(weight_Loginfo$Id)
## [1] 8
nrow(weight_Loginfo)
## [1] 67
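A single total of missing values does not show which column they come from. A minimal sketch of a per-column count using base R, on a made-up data frame (`toy_weight` is hypothetical and only borrows a few column names from the weight log):

```r
# Toy data frame (made up) with gaps in one column
toy_weight <- data.frame(
  Id = c(1, 1, 2, 3),
  WeightKg = c(52.6, 52.6, 133.5, 56.7),
  Fat = c(22, NA, NA, NA)
)

# Count missing values per column
na_per_column <- colSums(is.na(toy_weight))
na_per_column

# Share of rows that are missing in each column
na_share <- colMeans(is.na(toy_weight))
na_share
```

A column that is missing in most rows is usually better excluded from the analysis than imputed.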
summary(daily_userData)
sqldf('SELECT Id, Date, Calories, TotalSteps, TotalDistance, TrackerDistance, LoggedActivitiesDistance,
(VeryActiveDistance + ModeratelyActiveDistance + LightActiveDistance) AS TotalActivity,
(VeryActiveMinutes + FairlyActiveMinutes + LightlyActiveMinutes) / 60 AS TotalTimeActive
FROM daily_userData
WHERE TotalActivity <> 0 AND TotalTimeActive <> 0
LIMIT 10')
## Id Date Calories TotalSteps TotalDistance TrackerDistance
## 1 1503960366 4/12/2016 1985 13162 8.50 8.50
## 2 1503960366 4/13/2016 1797 10735 6.97 6.97
## 3 1503960366 4/15/2016 1745 9762 6.28 6.28
## 4 1503960366 4/16/2016 1863 12669 8.16 8.16
## 5 1503960366 4/17/2016 1728 9705 6.48 6.48
## 6 1503960366 4/19/2016 2035 15506 9.88 9.88
## 7 1503960366 4/20/2016 1786 10544 6.68 6.68
## 8 1503960366 4/21/2016 1775 9819 6.34 6.34
## 9 1503960366 4/23/2016 1949 14371 9.04 9.04
## 10 1503960366 4/24/2016 1788 10039 6.41 6.41
## LoggedActivitiesDistance TotalActivity TotalTimeActive
## 1 0 8.49 6.100000
## 2 0 6.97 4.283333
## 3 0 6.23 4.533333
## 4 0 8.16 4.450000
## 5 0 6.48 3.700000
## 6 0 9.88 5.750000
## 7 0 6.68 4.083333
## 8 0 6.34 3.966667
## 9 0 9.04 5.400000
## 10 0 6.41 4.700000
sqldf('SELECT Id, Date, Calories, TotalSleepRecords, TotalMinutesAsleep, TotalTimeInBed,
TotalSteps, TotalDistance, TrackerDistance, LoggedActivitiesDistance,
(VeryActiveDistance + ModeratelyActiveDistance + LightActiveDistance) AS TotalActivity,
SedentaryActiveDistance,
(VeryActiveMinutes + FairlyActiveMinutes + LightlyActiveMinutes) / 60 AS TimeActive,
SedentaryMinutes / 60 AS TimeSedentary
FROM daily_userData
LIMIT 10')
## Id Date Calories TotalSleepRecords TotalMinutesAsleep
## 1 1503960366 4/12/2016 1985 1 327
## 2 1503960366 4/13/2016 1797 2 384
## 3 1503960366 4/15/2016 1745 1 412
## 4 1503960366 4/16/2016 1863 2 340
## 5 1503960366 4/17/2016 1728 1 700
## 6 1503960366 4/19/2016 2035 1 304
## 7 1503960366 4/20/2016 1786 1 360
## 8 1503960366 4/21/2016 1775 1 325
## 9 1503960366 4/23/2016 1949 1 361
## 10 1503960366 4/24/2016 1788 1 430
## TotalTimeInBed TotalSteps TotalDistance TrackerDistance
## 1 346 13162 8.50 8.50
## 2 407 10735 6.97 6.97
## 3 442 9762 6.28 6.28
## 4 367 12669 8.16 8.16
## 5 712 9705 6.48 6.48
## 6 320 15506 9.88 9.88
## 7 377 10544 6.68 6.68
## 8 364 9819 6.34 6.34
## 9 384 14371 9.04 9.04
## 10 449 10039 6.41 6.41
## LoggedActivitiesDistance TotalActivity SedentaryActiveDistance TimeActive
## 1 0 8.49 0 6.100000
## 2 0 6.97 0 4.283333
## 3 0 6.23 0 4.533333
## 4 0 8.16 0 4.450000
## 5 0 6.48 0 3.700000
## 6 0 9.88 0 5.750000
## 7 0 6.68 0 4.083333
## 8 0 6.34 0 3.966667
## 9 0 9.04 0 5.400000
## 10 0 6.41 0 4.700000
## TimeSedentary
## 1 12.133333
## 2 12.933333
## 3 12.100000
## 4 12.883333
## 5 8.983333
## 6 12.916667
## 7 13.633333
## 8 13.966667
## 9 12.200000
## 10 11.816667
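The derived columns in the two queries above could also be computed with dplyr, which the analysis already loads. A rough sketch on made-up rows that reuse column names from `daily_userData`; `toy_days` and `toy_summary` are hypothetical names:

```r
library(dplyr)

# Toy rows (made up) with the same column names as daily_userData
toy_days <- data.frame(
  Id = c(1, 1),
  VeryActiveMinutes = c(25, 0),
  FairlyActiveMinutes = c(13, 0),
  LightlyActiveMinutes = c(328, 0),
  SedentaryMinutes = c(728, 1440)
)

toy_summary <- toy_days %>%
  mutate(
    # Minutes converted to hours, as in the SQL expressions
    TimeActive = (VeryActiveMinutes + FairlyActiveMinutes + LightlyActiveMinutes) / 60,
    TimeSedentary = SedentaryMinutes / 60
  ) %>%
  filter(TimeActive != 0) %>%  # mirrors the WHERE ... <> 0 condition
  select(Id, TimeActive, TimeSedentary)

toy_summary
```

Keeping the transformation in dplyr avoids switching between SQL strings and R pipelines within the same analysis.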
daily_userData <- daily_userData %>% mutate(TimeVeryActive = VeryActiveMinutes/60,
TimeFairlyActive = FairlyActiveMinutes/60,
TimeLightlyActive = LightlyActiveMinutes/60,
TimeSedentary = SedentaryMinutes/60)
Activitytime.gathered <- daily_userData %>% gather(key = 'variables', value = 'ActivityLevel', TimeVeryActive, TimeFairlyActive, TimeLightlyActive, TimeSedentary)
head(Activitytime.gathered)
## Id Date TotalSteps TotalDistance TrackerDistance
## 1 1503960366 4/12/2016 13162 8.50 8.50
## 2 1503960366 4/13/2016 10735 6.97 6.97
## 3 1503960366 4/15/2016 9762 6.28 6.28
## 4 1503960366 4/16/2016 12669 8.16 8.16
## 5 1503960366 4/17/2016 9705 6.48 6.48
## 6 1503960366 4/19/2016 15506 9.88 9.88
## LoggedActivitiesDistance VeryActiveDistance ModeratelyActiveDistance
## 1 0 1.88 0.55
## 2 0 1.57 0.69
## 3 0 2.14 1.26
## 4 0 2.71 0.41
## 5 0 3.19 0.78
## 6 0 3.53 1.32
## LightActiveDistance SedentaryActiveDistance VeryActiveMinutes
## 1 6.06 0 25
## 2 4.71 0 21
## 3 2.83 0 29
## 4 5.04 0 36
## 5 2.51 0 38
## 6 5.03 0 50
## FairlyActiveMinutes LightlyActiveMinutes SedentaryMinutes Calories
## 1 13 328 728 1985
## 2 19 217 776 1797
## 3 34 209 726 1745
## 4 10 221 773 1863
## 5 20 164 539 1728
## 6 31 264 775 2035
## TotalSleepRecords TotalMinutesAsleep TotalTimeInBed variables
## 1 1 327 346 TimeVeryActive
## 2 2 384 407 TimeVeryActive
## 3 1 412 442 TimeVeryActive
## 4 2 340 367 TimeVeryActive
## 5 1 700 712 TimeVeryActive
## 6 1 304 320 TimeVeryActive
## ActivityLevel
## 1 0.4166667
## 2 0.3500000
## 3 0.4833333
## 4 0.6000000
## 5 0.6333333
## 6 0.8333333
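`gather()` still works but is superseded in tidyr by `pivot_longer()`. A sketch of the same wide-to-long reshape on a made-up data frame (`toy_wide` is hypothetical); note that `pivot_longer()` orders the result by original row rather than by source column, so the row order differs from `gather()`:

```r
library(tidyr)

# Toy wide data (made up): one row per user-day, one column per activity level
toy_wide <- data.frame(
  Id = c(1, 2),
  Calories = c(1985, 1797),
  TimeVeryActive = c(0.5, 0.25),
  TimeSedentary = c(12, 13)
)

# Stack the two Time* columns into a key column and a value column
toy_long <- pivot_longer(
  toy_wide,
  cols = c(TimeVeryActive, TimeSedentary),
  names_to = "variables",
  values_to = "ActivityLevel"
)

toy_long
```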
Activitytime.gathered <- Activitytime.gathered %>%
mutate(variables = factor(variables, levels = c('TimeVeryActive', 'TimeFairlyActive', 'TimeLightlyActive', 'TimeSedentary')))
ggplot(Activitytime.gathered, aes(x = ActivityLevel, y = Calories, color = ActivityLevel)) +
geom_point() +
stat_smooth(method = "lm") +
facet_wrap(~variables, scales = "free") +
scale_color_gradient(low = "blue", high = "red") +
labs(title = "Relationship Between Activity Level and Calories Burned")
## `geom_smooth()` using formula 'y ~ x'
ggplot(daily_userData, aes(x = TotalSteps, y = Calories, color = TotalSteps)) +
geom_point() +
stat_smooth(method = "lm") +
scale_color_gradient(low = "red", high = "purple") +
labs(title = "Relationship Between Total Steps and Calories Burned") +
theme(legend.position = "none")
## `geom_smooth()` using formula 'y ~ x'
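The trend lines in the plots can be backed by numbers. A sketch using base R's `cor()` and `lm()` on made-up values (the `toy` data frame is illustrative only and says nothing about the actual Fitbit results):

```r
# Toy data (made up): calories rise with steps, plus some scatter
toy <- data.frame(
  TotalSteps = c(2000, 4000, 6000, 8000, 10000, 12000),
  Calories   = c(1600, 1750, 1800, 2000, 2050, 2300)
)

# Pearson correlation between the two variables
r <- cor(toy$TotalSteps, toy$Calories)

# Linear fit, the same model that stat_smooth(method = "lm") draws
fit <- lm(Calories ~ TotalSteps, data = toy)
slope <- coef(fit)[["TotalSteps"]]      # extra calories per additional step
r_squared <- summary(fit)$r.squared     # share of variance explained

c(r = r, slope = slope, r_squared = r_squared)
```

Running the same calls on `daily_userData` would give a correlation and slope to report alongside each plot.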
daily_userData <- daily_userData %>% mutate(TotalSleep = TotalMinutesAsleep/60)
Sleepquality.gathered <- daily_userData %>% gather(key = 'variables', value = 'ActivityLevel', TimeVeryActive, TimeFairlyActive, TimeLightlyActive, TimeSedentary)
head(Sleepquality.gathered)
## Id Date TotalSteps TotalDistance TrackerDistance
## 1 1503960366 4/12/2016 13162 8.50 8.50
## 2 1503960366 4/13/2016 10735 6.97 6.97
## 3 1503960366 4/15/2016 9762 6.28 6.28
## 4 1503960366 4/16/2016 12669 8.16 8.16
## 5 1503960366 4/17/2016 9705 6.48 6.48
## 6 1503960366 4/19/2016 15506 9.88 9.88
## LoggedActivitiesDistance VeryActiveDistance ModeratelyActiveDistance
## 1 0 1.88 0.55
## 2 0 1.57 0.69
## 3 0 2.14 1.26
## 4 0 2.71 0.41
## 5 0 3.19 0.78
## 6 0 3.53 1.32
## LightActiveDistance SedentaryActiveDistance VeryActiveMinutes
## 1 6.06 0 25
## 2 4.71 0 21
## 3 2.83 0 29
## 4 5.04 0 36
## 5 2.51 0 38
## 6 5.03 0 50
## FairlyActiveMinutes LightlyActiveMinutes SedentaryMinutes Calories
## 1 13 328 728 1985
## 2 19 217 776 1797
## 3 34 209 726 1745
## 4 10 221 773 1863
## 5 20 164 539 1728
## 6 31 264 775 2035
## TotalSleepRecords TotalMinutesAsleep TotalTimeInBed TotalSleep variables
## 1 1 327 346 5.450000 TimeVeryActive
## 2 2 384 407 6.400000 TimeVeryActive
## 3 1 412 442 6.866667 TimeVeryActive
## 4 2 340 367 5.666667 TimeVeryActive
## 5 1 700 712 11.666667 TimeVeryActive
## 6 1 304 320 5.066667 TimeVeryActive
## ActivityLevel
## 1 0.4166667
## 2 0.3500000
## 3 0.4833333
## 4 0.6000000
## 5 0.6333333
## 6 0.8333333
Sleepquality.gathered <- Sleepquality.gathered %>%
mutate(variables = factor(variables, levels = c('TimeVeryActive', 'TimeFairlyActive', 'TimeLightlyActive', 'TimeSedentary')))
ggplot(Sleepquality.gathered, aes(x = ActivityLevel, y = TotalSleep, color = TotalSleep)) +
geom_point() +
stat_smooth(method = "lm") +
facet_wrap(~variables, scales = "free") +
scale_color_gradient(low = "black", high = "yellow") +
labs(title = "Relationship Between Total Sleep and Activity Level")
## `geom_smooth()` using formula 'y ~ x'
The analysis revealed clear trends that could inform Bellabeat’s marketing strategy in the global smart device market.
These insights were:
There is a clear relation between higher physical activity and more calories burned.
More activity is linked with higher quality of sleep.
Based on these insights, the recommendations are:
Encourage users to set a daily step goal. Enable notifications that prompt them to meet it and, once they achieve it, invite them to set a higher goal when they feel ready.
Add a function to the Bellabeat app that prompts users to get at least 30 minutes of moderate activity when their data shows they are often sedentary during the day.
Have the app send encouraging, motivating messages, especially after an extended sedentary period.
Enhance the app to inform users of habits that disrupt sleep, such as irregular sleep schedules or too little activity.
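One way the sedentary reminder could be driven by data is to average each user's sedentary time and flag those above a cutoff. A minimal sketch, assuming a hypothetical cutoff of 16 hours per day and made-up usage rows (`toy_usage`, `sedentary_cutoff_hours`, and `SendReminder` are illustrative names, not part of the Bellabeat app):

```r
library(dplyr)

# Toy usage data (made up): two users, two days each
toy_usage <- data.frame(
  Id = c(1, 1, 2, 2),
  SedentaryMinutes = c(1200, 1100, 600, 700)
)

# Hypothetical cutoff: more than 16 sedentary hours per day on average
sedentary_cutoff_hours <- 16

flagged <- toy_usage %>%
  group_by(Id) %>%
  summarise(MeanSedentaryHours = mean(SedentaryMinutes) / 60) %>%
  mutate(SendReminder = MeanSedentaryHours > sedentary_cutoff_hours)

flagged
```

The cutoff would need to be chosen from the real distribution of sedentary time rather than fixed in advance.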