Inter_Intra_Variation/LoadData.Rmd at master · MattNolanLab/Inter_Intra_Variation · GitHub

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
---
title: "LoadData"
author: "Matt Nolan"
date: "02/05/2018"
output: html_document
---
```{r setup_LoadData, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

# Load data {-}

To load and preprocess data used for other analyses.


Import stellate cell data.
```{r import data, message = FALSE}
fname.sc <- "Data/datatable_sc.txt"
data.import.sc <- read_tsv(fname.sc)

# Strip out rows from data where locations are unknown (are NaN)
data.sc <- data.import.sc %>% drop_na(dvloc)

# Convert dvloc from microns to millimetres - prevents errors in model fitting large dv values
data.sc <- mutate(data.sc, dvlocmm = dvloc/1000)

# Add the number of observations for each mouse as a column
counts <- count(data.sc, id)
data.sc$counts <- counts$n[match(data.sc$id, counts$id)]

# Make sure id, hemi, housing, mlpos, expr, patchdir are factors
col_facs <- c("id", "hemi", "housing", "mlpos", "expr", "patchdir")
data.sc[col_facs] <- lapply(data.sc[col_facs], factor)
```

Import wfs1 cell data.
```{r, message = FALSE}
fname.wfs <- "Data/datatable_cal.txt"
data.import.wfs <- read_tsv(fname.wfs)
data.wfs <- data.import.wfs %>% drop_na(dvloc)
data.wfs <- mutate(data.wfs, dvlocmm = dvloc/1000)
```


Total number of cells recorded, and number in each environment:
```{r}
length(data.sc$housing)
count(data.sc, housing)
```

Number of stellate cells recorded per animal:
```{r}
counts <- data.sc %>% count(id, age, housing, dvloc)
summary(counts)
summary(filter(counts, housing == "Standard"))
summary(filter(counts, housing == "Large"))
```

Number of wfs1 cells recorded per animal:
```{r}
count(data.wfs, id, classification)
```


Add new columns containing copula transformed data. Make a new data frame _TP to contain the transformed data. This data frame has a column 'dvlocmm' with absolute locations and a column 'dvlocmm1' with transformed locations.
```{r}
trans.properties <- apply(apply(data.sc %>% dplyr::select(vm:fi), 2, edf), 2, qnorm )
trans.dvlocmm <- apply(apply(data.sc %>% dplyr::select(dvlocmm), 2, edf), 2, qnorm )
data.sc_TP <- bind_cols(as.data.frame(trans.properties), dplyr::select(data.sc,-(vm:fi)), as.data.frame(trans.dvlocmm)) %>%
  as_tibble()
```


Data is in a 'flat' format:
```{r}
head(data.sc)
```


Reformat data for use with map and other tidyverse functions. data.sc_R contains the experimentally measured proprties. data.sc_TP_r contains the transformed properties.
```{r}
data.sc_r <- data.sc %>%
  dplyr::select(vm:fi, dvlocmm, id, housing, id, mlpos, hemi, age, housing, expr, patchdir, rectime, counts) %>%
  gather("property", "value", vm:fi) %>%
  group_by(property) %>%
  nest %>%
  ungroup

data.sc_TP_r <- data.sc_TP %>%
  dplyr::select(vm:fi, dvlocmm, dvlocmm1, id, housing, id, mlpos, hemi, age, housing, expr, patchdir, rectime, counts) %>%
  gather("property", "value", vm:fi) %>%
  group_by(property) %>%
  nest %>%
  ungroup

```

Now the data is organised as a frame containing nested frames for each measurement:
```{r}
head(data.sc_r)
```

The nested frame for membrane potential (vm) looks like this:
```{r}
head(filter(data.sc_r, property == "vm")$data)
```