Title: | Perform Set Operations on Vectors, Automatically Generating All n-Wise Comparisons, and Create Markdown Output |
---|---|
Description: | Automates set operations (i.e., comparisons of overlap) between multiple vectors. It also contains a function for automating reporting in 'RMarkdown', by generating markdown output for easy analysis, as well as an 'RMarkdown' template for use with 'RStudio'. |
Authors: | Jacob Gerard Levernier [aut, cre] (Designed and authored the package source code and documentation. Roles: author, creator, designer, engineer, programmer), Heather Gaile Wacha [aut] (Provided intellectual overview and consultation during development for use with medieval cartographic datasets. Roles: conceptor, consultant, data contributor) |
Maintainer: | Jacob Gerard Levernier <[email protected]> |
License: | BSD_3_clause + file LICENSE |
Version: | 0.1.0 |
Built: | 2024-11-16 05:06:27 UTC |
Source: | https://github.com/jglev/r-veccompare |
The veccompare package contains functions for automating set operations (i.e., comparisons of overlap) between multiple vectors. Given a named list of 5 vectors, for example, veccompare can calculate all 2-, 3-, 4-, and 5-way comparisons between those vectors, recording for each comparison the set "union" (combined elements), "intersection" (overlap / shared elements), and complements (the elements unique to each vector involved in the comparison).
The package also contains a function for automating reporting in RMarkdown, by generating markdown output for easy analysis, as well as an RMarkdown template for use with RStudio.
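To make the idea of "all n-way comparisons" concrete, the following is a minimal base-R sketch of the kind of computation described above. It is not the package's implementation: the input list, the helper names `compare_one` and `all_comparisons`, and the output element names other than `elements_involved` are hypothetical and chosen only for illustration.

```r
# Hypothetical input: a named list of vectors (names and values are made up).
vectors <- list(
  vector_a = c("a", "b", "c", "d"),
  vector_b = c("b", "c", "e"),
  vector_c = c("c", "d", "e", "f")
)

# Compare one combination of vectors, identified by their names.
compare_one <- function(vector_names, all_vectors) {
  chosen <- lapply(all_vectors[vector_names], unique)
  list(
    elements_involved = vector_names,
    union = Reduce(union, chosen),
    intersection = Reduce(intersect, chosen),
    # Items found in one vector but in none of the others in this comparison.
    unique_to = setNames(
      lapply(seq_along(chosen), function(i) {
        setdiff(chosen[[i]], unlist(chosen[-i], use.names = FALSE))
      }),
      vector_names
    )
  )
}

# Build every combination of names for each requested degree, then compare.
all_comparisons <- function(all_vectors, degrees = 2:length(all_vectors)) {
  combos <- unlist(
    lapply(degrees, function(n) combn(names(all_vectors), n, simplify = FALSE)),
    recursive = FALSE
  )
  lapply(combos, compare_one, all_vectors = all_vectors)
}

results <- all_comparisons(vectors)
length(results)  # three 2-way comparisons plus one 3-way comparison
```

The number of comparisons grows quickly with the number of vectors: for five vectors, `sum(choose(5, 2:5))` gives 26 comparisons in total.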
The primary function from veccompare is compare.vectors. Complementarily, compare.vectors.and.return.text.analysis.of.overlap will call compare.vectors and generate Markdown-style output from it (for example, for use within an RMarkdown file).
An RMarkdown template illustrating several of veccompare's features can be used from within RStudio by clicking File -> New File -> R Markdown... -> From Template -> Veccompare Overlap Report.
veccompare also provides a function, summarize.two.way.comparisons.percentage.overlap, that can create correlation-plot-style images and network graphs for all two-way comparisons between vectors. This function is also demonstrated in the Veccompare Overlap Report described above.
Useful links:
Report bugs at https://github.com/publicus/r-veccompare/issues
Compare all combinations of vectors using set operations
compare.vectors(named_list_of_vectors_to_compare, degrees_of_comparison_to_include = NULL, draw_venn_diagrams = FALSE, vector_colors_for_venn_diagrams = NULL, save_venn_diagram_files = FALSE, location_for_venn_diagram_files = "", prefix_for_venn_diagram_files = "", saved_venn_diagram_resolution_ppi = 300, saved_venn_diagram_dimension_units = "in", saved_venn_diagram_width = 8, saved_venn_diagram_height = 6, viewport_npc_width_height_for_images = 1, suppress_messages = FALSE)
named_list_of_vectors_to_compare |
A named list of vectors to compare (see, for example, |
degrees_of_comparison_to_include |
A number or vector of numbers of which degrees of comparison to print (for example, 'c(2, 5)' would print only 2- and 5-way vector comparisons). |
draw_venn_diagrams |
A logical (TRUE/FALSE) indicator whether to draw Venn diagrams for all 2- through 5-way comparisons of vectors. |
vector_colors_for_venn_diagrams |
An optional vector of color names for Venn diagrams (if |
save_venn_diagram_files |
A logical (TRUE/FALSE) indicator whether to save Venn diagrams as PNG files. |
location_for_venn_diagram_files |
An optional string giving a directory into which to save Venn diagram PNG files (if |
prefix_for_venn_diagram_files |
An optional string giving a prefix to prepend to saved Venn diagram PNG files (if |
saved_venn_diagram_resolution_ppi |
An optional number giving a resolution (PPI) for saved Venn diagrams (if |
saved_venn_diagram_dimension_units |
An optional string giving units for specifying |
saved_venn_diagram_width |
The width (in |
saved_venn_diagram_height |
The height (in |
viewport_npc_width_height_for_images |
The scale at which to print an image. If the image is cut off at its edges, for example, this can be set lower than 1.0. |
suppress_messages |
A logical (TRUE/FALSE) indicator whether to suppress messages. Even if this is |
A list, with one object for each comparison of vectors. The list contains the following elements:
The vector names involved in the comparison.
A vector of all (deduplicated) items involved in the comparison, across all of the vectors.
A vector of the deduplicated elements that occurred in all of the compared vectors.
This element will have a sub-element named for each vector being compared (i.e., for each of the names in $elements_involved). Each sub-element holds the (deduplicated) items that were unique to that vector (i.e., not overlapping with any other vector in the comparison).
If save_venn_diagram_files is TRUE, and the comparison is of 2 through 5 vectors, a Venn diagram object produced using the VennDiagram package. This diagram can be rendered using render.venn.diagram.
To compile this list object into a Markdown report, use compare.vectors.and.return.text.analysis.of.overlap. For an example of this usage, see the Veccompare Overlap Report RMarkdown template for RStudio that is installed as part of the veccompare package.
example <- veccompare::compare.vectors(veccompare::example.vectors.list)

# To extract similar elements across list items:
veccompare::extract.compared.vectors(
  example,
  elements_of_output = "elements_involved"
)

# To extract all comparisons that involve "vector_a":
veccompare::extract.compared.vectors(
  example,
  vector_names = "vector_a"
)

# To find all comparisons that were about "vector_a" and "vector_c":
veccompare::extract.compared.vectors(
  example,
  vector_names = c("vector_a", "vector_c"),
  only_match_vector_names = TRUE
)

# To get all elements that did a two-way comparison:
veccompare::extract.compared.vectors(
  example,
  degrees_of_comparison = 2
)
compare.vectors
This function is a wrapper for compare.vectors. It creates a Markdown report of all degrees of set comparisons between a named list of vectors.
compare.vectors.and.return.text.analysis.of.overlap(named_list_of_vectors_to_compare, degrees_of_comparison_to_include = NULL, cat_immediately = FALSE, draw_venn_diagrams = FALSE, viewport_npc_width_height_for_images = 1, vector_colors_for_venn_diagrams = NULL, save_venn_diagram_files = FALSE, location_for_venn_diagram_files = "", prefix_for_venn_diagram_files = "", saved_venn_diagram_resolution_ppi = 300, saved_venn_diagram_dimension_units = "in", saved_venn_diagram_width = 8, saved_venn_diagram_height = 6, base_heading_level_to_use = 1)
named_list_of_vectors_to_compare |
A named list of vectors to compare (see, for example, |
degrees_of_comparison_to_include |
A number or vector of numbers of which degrees of comparison to print (for example, 'c(2, 5)' would print only 2- and 5-way vector comparisons). |
cat_immediately |
A logical (TRUE/FALSE) indicator whether to immediately print the output, as in an RMarkdown document. |
draw_venn_diagrams |
A logical (TRUE/FALSE) indicator whether to draw Venn diagrams for all 2- through 5-way comparisons of vectors. |
viewport_npc_width_height_for_images |
The scale at which to print an image. If the image is cut off at its edges, for example, this can be set lower than 1.0. |
vector_colors_for_venn_diagrams |
An optional vector of color names for Venn diagrams (if |
save_venn_diagram_files |
A logical (TRUE/FALSE) indicator whether to save Venn diagrams as PNG files. |
location_for_venn_diagram_files |
An optional string giving a directory into which to save Venn diagram PNG files (if |
prefix_for_venn_diagram_files |
An optional string giving a prefix to prepend to saved Venn diagram PNG files (if |
saved_venn_diagram_resolution_ppi |
An optional number giving a resolution (PPI) for saved Venn diagrams (if |
saved_venn_diagram_dimension_units |
An optional string giving units for specifying |
saved_venn_diagram_width |
The width (in |
saved_venn_diagram_height |
The height (in |
base_heading_level_to_use |
An integer indicating the highest-level heading to print. Defaults to |
Use of this function is illustrated with the Veccompare Overlap Report RMarkdown template for RStudio that is installed as part of the veccompare package.
A string of Markdown (and Venn diagrams, if draw_venn_diagrams is TRUE).
If cat_immediately is TRUE, nothing is returned by the function; rather, the output Markdown is printed immediately (for example, as part of a Knitted RMarkdown document, or to the console).
If cat_immediately is FALSE, the output can be saved to an object (as in the example below). This object can then be printed using cat().
NOTE WELL: If cat_immediately is FALSE, the output should be saved to an object. If it is not, R will give an error message when printing to the console, because of unescaped special characters (which work correctly when cat() is used).
example <- compare.vectors.and.return.text.analysis.of.overlap(
  veccompare::example.vectors.list,
  cat_immediately = FALSE,
  draw_venn_diagrams = FALSE
)

cat(example)
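The difference between saving the Markdown string and printing it with cat() can be seen with any string that contains newlines and quotation marks. The sketch below uses a made-up string (`markdown_text` is hypothetical, not real package output) and only base R; it illustrates why cat() is the appropriate way to display such a string.

```r
# A made-up Markdown string standing in for the function's return value.
markdown_text <- paste0(
  "# Overlap report\n\n",
  "- Items in \"vector_a\": 3\n"
)

# print() shows the string as a single quoted value, with escapes visible.
printed <- capture.output(print(markdown_text))

# cat() interprets the escapes, producing Markdown that can be rendered.
catted <- capture.output(cat(markdown_text))

length(printed)  # one line, containing literal \n and \" sequences
length(catted)   # three lines: heading, blank line, list item
cat(markdown_text)
```

In an RMarkdown document, output written with cat() is typically placed in a chunk that uses the knitr chunk option results = "asis", so that the Markdown is rendered rather than shown as console output.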
An example dataset containing several named vectors, which can be compared to one another for overlaps, unique elements, etc.
example.vectors.list
A list of named vectors.
compare.vectors
Straightforwardly extract particular elements from the output of compare.vectors.
extract.compared.vectors(output_from_compare.vectors, vector_names = NULL, only_match_vector_names = FALSE, degrees_of_comparison = NULL, elements_of_output = NULL)
output_from_compare.vectors |
The list output of |
vector_names |
An optional vector of names to extract from the named list ( |
only_match_vector_names |
A logical (TRUE/FALSE) indicator whether to match only |
degrees_of_comparison |
An optional number or vector of numbers indicating which degrees of comparison to return (for example, 2 will return only two-way comparisons from |
elements_of_output |
An optional vector of element names from |
A winnowed version of output_from_compare.vectors. Depending on arguments, either a list, a vector, or a string.
example <- veccompare::compare.vectors(veccompare::example.vectors.list)

# To extract similar elements across list items:
veccompare::extract.compared.vectors(
  example,
  elements_of_output = "elements_involved"
)

# To extract all comparisons that involve "vector_a":
veccompare::extract.compared.vectors(
  example,
  vector_names = "vector_a"
)

# To find all comparisons that were about "vector_a" and "vector_c":
veccompare::extract.compared.vectors(
  example,
  vector_names = c("vector_a", "vector_c"),
  only_match_vector_names = TRUE
)

# To get all elements that did a two-way comparison:
veccompare::extract.compared.vectors(
  example,
  degrees_of_comparison = 2
)

# A more complex / specific example:
extract.compared.vectors(
  example,
  vector_names = c("vector_a", "vector_c"),
  only_match_vector_names = FALSE,
  degrees_of_comparison = c(2, 3),
  elements_of_output = "elements_involved"
)
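For readers who want to see what this kind of winnowing amounts to, here is a rough base-R sketch that filters a list of comparisons by the names and the number of vectors involved. The `comparisons` list and the `filter_comparisons` helper are hypothetical, and the matching rule for the non-exact case (keep a comparison if it involves all of the requested names) is an assumption that may differ from the package's behavior.

```r
# Hypothetical stand-in for a list of comparisons: one entry per comparison.
comparisons <- list(
  list(elements_involved = c("vector_a", "vector_b")),
  list(elements_involved = c("vector_a", "vector_c")),
  list(elements_involved = c("vector_b", "vector_c")),
  list(elements_involved = c("vector_a", "vector_b", "vector_c"))
)

filter_comparisons <- function(comparisons, vector_names = NULL,
                               exact_match = FALSE, degrees = NULL) {
  keep <- vapply(comparisons, function(comparison) {
    involved <- comparison$elements_involved
    # Assumed rule: exact match means the same set of names; otherwise the
    # comparison must involve every requested name (and possibly others).
    names_ok <- is.null(vector_names) ||
      (if (exact_match) setequal(involved, vector_names)
       else all(vector_names %in% involved))
    degree_ok <- is.null(degrees) || length(involved) %in% degrees
    names_ok && degree_ok
  }, logical(1))
  comparisons[keep]
}

# All comparisons that involve "vector_a":
length(filter_comparisons(comparisons, vector_names = "vector_a"))

# Only the comparison that is exactly about "vector_a" and "vector_c":
filter_comparisons(
  comparisons,
  vector_names = c("vector_a", "vector_c"),
  exact_match = TRUE
)

# All two-way comparisons:
length(filter_comparisons(comparisons, degrees = 2))
```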
A function to generate a given number of random colors.
generate.random.colors(number_of_colors_to_get)
number_of_colors_to_get |
The number of colors to generate. |
A vector of R color names.
generate.random.colors(5)
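One plausible way to obtain a vector of random R color names, shown here only as a sketch, is to sample from the names returned by grDevices::colors(). Whether the package does exactly this is an assumption; `pick_random_colors` is a hypothetical helper.

```r
# Hypothetical helper: draw n distinct color names known to R.
pick_random_colors <- function(n) {
  sample(grDevices::colors(), n)
}

set.seed(1)  # only so that repeated runs give the same colors
picked <- pick_random_colors(5)
picked
```

Because the names come from grDevices::colors(), they can be passed anywhere R accepts a color name, for example as fill colors for a diagram.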
A wrapper function for printing a grid-based image using grid::grid.draw().
render.venn.diagram(venn_diagram_created_with_VennDiagram_package, viewport_npc_width_height_for_images = 1)
venn_diagram_created_with_VennDiagram_package |
A grid-based diagram object. For example, a Venn diagram previously generated using |
viewport_npc_width_height_for_images |
The scale at which to print an image. If the image is cut off at its edges, for example, this can be set lower than 1.0. |
The function will not return a value; rather, it will print the image.
# Create comparisons across 5 vectors, specifically creating all 4-way venn diagrams from them:
example <- veccompare::compare.vectors(
  veccompare::example.vectors.list[1:5],
  draw_venn_diagrams = TRUE,
  suppress_messages = TRUE,
  degrees_of_comparison_to_include = 4
)

# Get the first 4-way comparison that includes a diagram:
diagram <- veccompare::extract.compared.vectors(
  example,
  degrees_of_comparison = 4,
  elements_of_output = "venn_diagram"
)[[1]]$venn_diagram

# Print the diagram:
veccompare::render.venn.diagram(
  diagram,
  viewport_npc_width_height_for_images = .7
  # Scale the image down to 70%,
  # in case it otherwise gets cut off at the margins.
)
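The scaling behavior described for viewport_npc_width_height_for_images can be approximated with the grid package alone. The sketch below draws an arbitrary grob inside a viewport whose width and height are a fraction of the page (in "npc" units). It is an illustration under that assumption, not the package's source; `draw_scaled` is a hypothetical helper, and a plain circle stands in for a Venn diagram.

```r
library(grid)

# Hypothetical helper: draw a grid object inside a scaled-down viewport.
draw_scaled <- function(grob, scale = 1) {
  grid.newpage()
  # Numeric width/height are interpreted in "npc" units (fractions of the page).
  pushViewport(viewport(width = scale, height = scale))
  grid.draw(grob)
  popViewport()
  invisible(NULL)
}

grDevices::pdf(NULL)             # off-screen device, so no file is written
shape <- circleGrob(r = 0.5)     # stand-in for a diagram object
draw_scaled(shape, scale = 0.7)  # drawn at 70% of the page size
invisible(grDevices::dev.off())
```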
Summarize Percentage Overlap for Two-Way Comparisons between Vectors
summarize.two.way.comparisons.percentage.overlap(named_list_of_vectors_to_compare, output_type = "table", melt_table = FALSE, network_graph_minimum = 0, margins_for_plot = NULL)
named_list_of_vectors_to_compare |
A named list of vectors to compare (see, for example, |
output_type |
Either |
melt_table |
A logical (TRUE/FALSE) indicator, when |
network_graph_minimum |
|
margins_for_plot |
The margins for image output (if |
Either a matrix (if output_type is "table"), or an image (if output_type is "matrix_plot" or "network_graph"). If an image is printed, nothing is returned by the function; rather, the output is printed immediately.
If output_type is "table" and melt_table is FALSE, the output will be a matrix with nrow and ncol both equal to the number of vectors in named_list_of_vectors_to_compare. This table shows the decimal percentage overlap (e.g., "0.20" = 20%) between each combination of vectors. This table is intended to be read with row names first, in this form: "[row title] overlaps with [column title] [cell value] percent."
If output_type is "table" and melt_table is TRUE, the output will be a melted data.frame with three columns: Vector_Name, Overlaps_With, and Decimal_Percentage.
summarize.two.way.comparisons.percentage.overlap(veccompare::example.vectors.list)

summarize.two.way.comparisons.percentage.overlap(
  veccompare::example.vectors.list,
  output_type = "table",
  melt_table = TRUE
)

summarize.two.way.comparisons.percentage.overlap(
  veccompare::example.vectors.list,
  output_type = "matrix_plot" # You can also choose "network_graph"
)
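As a rough illustration of how a row-first overlap table can be read, the base-R sketch below builds such a matrix for a made-up list of vectors. It assumes that each cell is the share of the row vector's unique items that also appear in the column vector; that definition, the input data, and the helper name `overlap_proportion` are assumptions made for the example, not a statement of the package's exact calculation.

```r
# Made-up input vectors.
vectors <- list(
  vector_a = c("a", "b", "c", "d"),
  vector_b = c("b", "c", "e"),
  vector_c = c("c", "d", "e", "f", "g")
)

# Assumed definition: share of the row vector's unique items found in the column vector.
overlap_proportion <- function(row_vector, column_vector) {
  row_items <- unique(row_vector)
  length(intersect(row_items, column_vector)) / length(row_items)
}

# Rows are the "overlaps with" subject; columns are the vector compared against.
overlap_table <- sapply(vectors, function(column_vector) {
  sapply(vectors, function(row_vector) {
    overlap_proportion(row_vector, column_vector)
  })
})
overlap_table

# A "melted" long-format version with one row per (row, column) pair.
melted <- as.data.frame(as.table(overlap_table))
names(melted) <- c("Vector_Name", "Overlaps_With", "Decimal_Percentage")
melted
```

Note that the table is not symmetric: here vector_a overlaps with vector_b by 0.5 (2 of its 4 items), while vector_b overlaps with vector_a by about 0.67 (2 of its 3 items).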
Print a vector with commas and a final "and".
vector.print.with.and(vector_to_print, string_to_return_if_vector_is_empty = "", use_oxford_comma = TRUE)
vector_to_print |
A vector of strings (or elements able to be coerced into strings) to print. |
string_to_return_if_vector_is_empty |
If |
use_oxford_comma |
A logical (TRUE/FALSE) value indicating whether to use an Oxford comma ("One, two, and three" vs. "One, two and three"). |
A single string that concatenates the input, separating with commas and adding "and" before the final item.
vector.print.with.and(c("One", "Two", "Three", "Four"))
vector.print.with.and(c("One", "Two", "Three", "Four"), use_oxford_comma = FALSE)
vector.print.with.and(c("One", "Two"))
vector.print.with.and(c("One"))
vector.print.with.and(c(), string_to_return_if_vector_is_empty = "(None)") # Outputs "(None)"
vector.print.with.and(c(""), string_to_return_if_vector_is_empty = "(None)") # Outputs ""
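The joining logic described above can be sketched in a few lines of base R. The helper below, `join_with_and`, is hypothetical and written only to illustrate the behavior documented here (commas between items, "and" before the last item, an optional Oxford comma, and a fallback string for empty input); it is not the package's source.

```r
# Hypothetical helper illustrating comma-and-"and" joining.
join_with_and <- function(items, if_empty = "", oxford_comma = TRUE) {
  items <- as.character(items)
  n <- length(items)
  if (n == 0) return(if_empty)  # nothing to join
  if (n == 1) return(items)     # a single item is returned unchanged
  if (n == 2) return(paste(items, collapse = " and "))
  last_separator <- if (oxford_comma) ", and " else " and "
  paste0(paste(items[-n], collapse = ", "), last_separator, items[n])
}

join_with_and(c("One", "Two", "Three"))
join_with_and(c("One", "Two", "Three"), oxford_comma = FALSE)
join_with_and(c("One", "Two"))
join_with_and(c(), if_empty = "(None)")
```

As in the documented examples, a vector containing a single empty string counts as non-empty in this sketch, so the fallback string is used only when the vector has length zero.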
This function is a wrapper for setdiff. It makes it easier to remember which vector is being subtracted from the other, by displaying an explicit message.
which.of.one.set.is.not.in.another(set_1, set_2, suppress_messages = FALSE)
set_1 |
A vector to be subtracted from. |
set_2 |
A vector to subtract from |
suppress_messages |
A logical (TRUE/FALSE) indicator whether to suppress messages. |
A vector of the values of set_1 that are not present in set_2. Put differently, a vector resulting from subtracting set_2 from set_1.
veccompare::which.of.one.set.is.not.in.another(
  veccompare::example.vectors.list$vector_a,
  veccompare::example.vectors.list$vector_b
)

veccompare::which.of.one.set.is.not.in.another(
  veccompare::example.vectors.list$vector_b,
  veccompare::example.vectors.list$vector_a
)
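Because the direction of a set difference is easy to mix up, the following base-R sketch shows the same idea with made-up vectors. The wrapper `not_in_other` and the wording of its message are hypothetical, intended only to show how an explicit message can make the direction of the subtraction visible.

```r
first_set <- c("a", "b", "c", "d")
second_set <- c("c", "d", "e")

# Hypothetical wrapper around setdiff() that states the direction explicitly.
not_in_other <- function(set_1, set_2, quiet = FALSE) {
  if (!quiet) {
    message("Returning the items of set_1 that are not in set_2.")
  }
  setdiff(set_1, set_2)
}

not_in_other(first_set, second_set)  # items only in first_set
not_in_other(second_set, first_set)  # items only in second_set
```

Swapping the two arguments gives a different answer, which is the point of stating the direction in the message.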