% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/gmql_project.R
\name{select}
\alias{select}
\alias{select,GMQLDataset-method}
\alias{select-method}
\title{Method select}
\usage{
\S4method{select}{GMQLDataset}(
.data,
metadata = NULL,
metadata_update = NULL,
all_but_meta = FALSE,
regions = NULL,
regions_update = NULL,
all_but_reg = FALSE
)
}
\arguments{
\item{.data}{GMQLDataset class object}
\item{metadata}{vector of strings naming the metadata attributes}
\item{metadata_update}{list of updating rules in the form of
key = value generating new metadata attributes and/or attribute values.
The following options are available:
\itemize{
\item{All aggregation functions already defined by AGGREGATES object}
\item{All basic mathematical operations (+, -, *, /), including parentheses}
\item{SQRT constructor object defined by OPERATOR object}
}}
\item{all_but_meta}{logical value indicating how the \emph{metadata}
argument is interpreted; if FALSE, only the metadata attributes specified in
the \emph{metadata} argument are kept in the output of the operation; if TRUE,
all metadata attributes are kept except those in the \emph{metadata} argument.
If \emph{metadata} is not defined, \emph{all_but_meta}
is not considered.}
\item{regions}{vector of strings naming the region attributes}
\item{regions_update}{list of updating rules in the form of
key = value generating new genomic region attributes and/or values.
The following options are available:
\itemize{
\item{All aggregation functions already defined by AGGREGATES object}
\item{All basic mathematical operations (+, -, *, /), including parentheses}
\item{SQRT, META, NIL constructor objects defined by OPERATOR object}
}}
\item{all_but_reg}{logical value indicating how the \emph{regions}
argument is interpreted; if FALSE, only the region attributes specified in
the \emph{regions} argument are kept in the output of the operation; if TRUE,
all region attributes are kept except those in the \emph{regions} argument.
If \emph{regions} is not defined, \emph{all_but_reg} is not considered.}
}
\value{
GMQLDataset object. It contains the value to use as input
for the subsequent GMQLDataset method.
}
\description{
Wrapper to the GMQL PROJECT operator.
It creates, from an existing dataset, a new dataset with all
the samples from the input dataset, but keeping for each sample
only the specified metadata and/or region attributes.
Region coordinates and values of the remaining metadata and/or region
attributes remain equal to those in the input dataset. It allows you to:
\itemize{
\item{Remove existing metadata and/or region attributes from a dataset}
\item{Update or set new metadata and/or region attributes in the result}
}
}
\examples{
## This statement initializes and runs the GMQL server for local execution
## and creation of results on disk. Then, with system.file() it defines
## the path to the folder "DATASET" in the subdirectory "example"
## of the package "RGMQL" and opens such folder as a GMQL dataset
## named "data"
init_gmql()
test_path <- system.file("example", "DATASET", package = "RGMQL")
data = read_gmql(test_path)
## This statement creates a new dataset called CTCF_NORM_SCORE by preserving
## all region attributes apart from score, and creating a new region
## attribute called new_score by dividing the existing score value of each
## region by 1000.0 and incrementing it by 100.
## It also generates, for each sample of the new dataset,
## a new metadata attribute called normalized with value 1,
## which can be used in future selections.
CTCF_NORM_SCORE = select(data, metadata_update = list(normalized = 1),
regions_update = list(new_score = (score / 1000.0) + 100),
regions = c("score"), all_but_reg = TRUE)
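
## A further sketch, not part of the original example: assuming the region
## attribute "score" used above exists in the dataset, this statement keeps
## only that region attribute in each sample, because all_but_reg is left
## at its default value FALSE. The name SCORE_ONLY is hypothetical.

SCORE_ONLY = select(data, regions = c("score"))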
}