Title: | Calculations for All Common Subsequences |
---|---|
Description: | Implements several string comparison algorithms, including calACS (count all common subsequences), lenACS (calculate the lengths of all common subsequences), and lenLCS (calculate the length of the longest common subsequence). Some algorithms differentiate between the more strict definition of subsequence, where a common subsequence cannot be separated by any other items, from its looser counterpart, where a common subsequence can be interrupted by other items. This difference is shown in the suffix of the algorithm (-Strict vs -Loose). For example, q-w is a common subsequence of q-w-e-r and q-e-w-r on the looser definition, but not on the more strict definition. calACSLoose Algorithm from Wang, H. All common subsequences (2007) IJCAI International Joint Conference on Artificial Intelligence, pp. 635-640. |
Authors: | Alan Gu |
Maintainer: | Alan Gu <[email protected]> |
License: | GPL |
Version: | 2.2.2 |
Built: | 2025-02-28 05:31:51 UTC |
Source: | https://github.com/cran/calACS |
Calculate the total number of all common subsequences between a string and a vector/list of strings. Subsequences can be interrupted by items, i.e. q-w is considered a subsequence of q-e-w-r
calACSLoose(vecA, listB, sep = "-", dropFirstItem = FALSE)
calACSLoose(vecA, listB, sep = "-", dropFirstItem = FALSE)
vecA |
The single string |
listB |
The vector/list of 1 or more strings |
sep |
Delimiter separating each items in a sequence |
dropFirstItem |
Boolean. If true, the first item in each sequence is excluded from counting all subsequences |
The total number of all common subsequences as an integer in a vector
calACSLoose("q-w-e-r", c("q-e-w-r","q-r-e-w"), "-") calACSLoose("itemToBeDropped-q-w-e-r", "itemToBeDroped-q-e-w-r", "-", dropFirstItem=TRUE)
calACSLoose("q-w-e-r", c("q-e-w-r","q-r-e-w"), "-") calACSLoose("itemToBeDropped-q-w-e-r", "itemToBeDroped-q-e-w-r", "-", dropFirstItem=TRUE)
Count the total number of all common subsequences between a string and a vector/list of strings. Subsequences cannot be interrupted by any item, i.e. q-w is not considered a subsequence of q-e-w-r due to the interrupting 'e'
calACSStrict(vecA, listB, sep = "-", dropFirstItem = FALSE, ignoreLenOneSubseq = FALSE, ignoreLenZeroSubseq = FALSE)
calACSStrict(vecA, listB, sep = "-", dropFirstItem = FALSE, ignoreLenOneSubseq = FALSE, ignoreLenZeroSubseq = FALSE)
vecA |
The single string |
listB |
The vector/list of 1 or more strings |
sep |
Delimiter separating each items in a sequence |
dropFirstItem |
Boolean. If true, the first item in each sequence is excluded from counting all subsequences |
ignoreLenOneSubseq |
Boolean. If true, all length one subequences are not counted as common subsequences |
ignoreLenZeroSubseq |
Boolean. If true, the length zero subsequence (empty set) is not counted as a common subsequence |
The total number of all common subsequences as an integer in a vector
calACSStrict("q-w-e-r", c("q-e-w-r","q-r-e-w"), "-") calACSStrict("itemToBeDropped-q-w-e-r", "itemToBeDroped-q-e-w-r", "-", dropFirstItem=TRUE)
calACSStrict("q-w-e-r", c("q-e-w-r","q-r-e-w"), "-") calACSStrict("itemToBeDropped-q-w-e-r", "itemToBeDroped-q-e-w-r", "-", dropFirstItem=TRUE)
Calculate the length of each common subsequences between a string and a vector/list of strings. Subsequences cannot be interrupted by any item, i.e. q-w is not considered a subsequence of q-e-w-r due to the interrupting 'e'
lenACSStrict(vecA, listB, sep = "-", dropFirstItem = FALSE, ignoreLenOneSubseq = FALSE)
lenACSStrict(vecA, listB, sep = "-", dropFirstItem = FALSE, ignoreLenOneSubseq = FALSE)
vecA |
The single string |
listB |
The vector/list of 1 or more strings |
sep |
Delimiter separating each items in a sequence |
dropFirstItem |
Boolean. If true, the first item in each sequence is excluded from counting all subsequences |
ignoreLenOneSubseq |
Boolean. If true, all length one subequences are not counted as common subsequences |
A list of vectors of the length of each common subsequence
lenACSStrict("q-w-e-r", c("q-e-w-r","q-r-e-w","q-w-r-e"), "-") lenACSStrict("itemToBeDropped-q-w-e-r", "itemToBeDroped-q-e-w-r", "-", dropFirstItem=TRUE)
lenACSStrict("q-w-e-r", c("q-e-w-r","q-r-e-w","q-w-r-e"), "-") lenACSStrict("itemToBeDropped-q-w-e-r", "itemToBeDroped-q-e-w-r", "-", dropFirstItem=TRUE)
Calculate the length of the longest common subsequence (KCS) between a string and a vector/list of strings. Subsequences cannot be interrupted by any item, i.e. q-w is not considered a subsequence of q-e-w-r due to the interrupting 'e'
lenLCSStrict(vecA, listB, sep = "-", dropFirstItem = FALSE)
lenLCSStrict(vecA, listB, sep = "-", dropFirstItem = FALSE)
vecA |
The single string |
listB |
The vector/list of 1 or more strings |
sep |
Delimiter separating each items in a sequence |
dropFirstItem |
Boolean. If true, the first item in each sequence is excluded from counting all subsequences |
A list of vectors of the length of each common subsequence
lenACSStrict("q-w-e-r", c("q-e-w-r","q-r-e-w","q-w-r-e"), "-") lenACSStrict("itemToBeDropped-q-w-e-r", "itemToBeDroped-q-e-w-r", "-", dropFirstItem=TRUE)
lenACSStrict("q-w-e-r", c("q-e-w-r","q-r-e-w","q-w-r-e"), "-") lenACSStrict("itemToBeDropped-q-w-e-r", "itemToBeDroped-q-e-w-r", "-", dropFirstItem=TRUE)
The function takes in multiple vectors of any length, and returns the one with the longest length. The tieBreaker variable controls if the first or the last of the longest vectors gets returned in case there are multiple
longestVec(..., tieBreaker = "last")
longestVec(..., tieBreaker = "last")
... |
vectors of any length |
tieBreaker |
decides if the first or the last longest vector gets returned if there are multiple longest vectors. Can be either 'first' or 'last'. Default to 'last'. |
longestVec(1:5, c('a','b'))
longestVec(1:5, c('a','b'))