Posts 最长回文子串和最长回文子序列
Post
Cancel

最长回文子串和最长回文子序列

最长回文子串与最长回文子序列

 如果某个序列有如下性质:

1
        X = {x1, x2, x3, ..., xn},  满足x1 < x2 < x3 < ... < xn, 即xi < xi+1 (0 <= i <= n-1)

这个序列就称之为上升(递增)序列。类似地,如果此序列并不是完全升序,即满足x1 <= x2 <= x3 < ... <= xn,这样的序列称之为不下降(非递减)序列

最长上升子序列(LIS, Longest Increasing Subsequence)最长不下降子序列(LNDS, Longest Not-decreasing Subsequence)问题,就是要在给定的序列中找到上升序列或不下降序列,使得序列长度最长。这两类问题极其相似,针对最长上升子序列问题LIS进行详细求解,并在相应位置给出对于最长不下降子序列LNDS求解所需要的改动。
最长上升子序列问题基本描述:给定包含NN个元素的原始序列seq,求解其最长上升子序列的长度。例子:对于序列{1, 2, 4, 5, 3, 2, 2},其最长上升子序列为{1, 2, 4, 5},长度为4。其最长不下降子序列有2个,长度均为4:{1, 2, 4, 5}{1, 2, 2, 2}

问题分析与求解

动态规划

 最长上升子序列问题,是满足最优子结构性质的(证明步骤作为遗留问题,可以清晰地看到,当分析到X中第ii个元素时,问题的求解是和其前面的元素jj (0< jj < ii)相关的)。考虑采用动态规划的方式进行求解,尝试刻画最优子结构特征来得到状态转移方程。
 定义状态dpdp[ii]表示 以序列X中第ii个元素结尾的LIS的长度
 初始状态时,以每个元素结尾,初始化LIS的长度为1,即dp[ii] = 1 (0 <= ii <= NN)。从头到尾遍历整个序列,到达序列的任意位置ii (0 <= ii <= NN)时,遍历该位置前的所有元素jj (0 <= jj < ii)。如果seq[ii] < seq[jj],那么可以将seq[jj]接在以seq[ii]为结尾的LIS后面,形成一个新的LIS;否则,当前以当前位置结束的LIS不变。由此,可以得到状态转移方程:

1
2
    dp[ii] = dp[jj] + 1;          // if seq[jj] < seq[ii]
    dp[ii] = 1                    // if seq[jj] >= seq[ii]

 对于最长上升子序列问题,常常只需要给出LIS的长度。更进一步地,尝试能够得到LIS具体的序列。需要注意的时,给定序列的LIS可能会有多个,并且多个LIS存在两种情况:

  • 以同一个元素结尾的LIS可能不止一个。给定序列{1, 2, 5, 4, 7}{1, 2, 5, 7}{1, 2, 4, 7}均是LIS,均以7为结尾元素。
  • 多个LIS以不同的元素结尾。给定序列{4, 5, 1, 2}{4, 5}{1, 2}是其两个以不同元素结尾的LIS。

 定义嵌套的vector容器vector<vector<vector<int> > > lis来记录以每个元素结尾的LIS,其中内层的vector<vector<int> >用来记录具体的LIS,由于可能有多个,所以也使用了vector容器来存储。举例说明:对于给定序列{1, 2, 5, 4, 7}的以7为结尾元素的LIS,将存储为lis[4] = { {1, 2, 5, 7}, {1, 2, 4, 7} }
 具体的求解程序展示如下,由两个循环组成,所以时间复杂度为n^2

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
// Time complexity: n^2
int LIS_n_n(vector<int> &seq, int NN)
{
    /*
     * dp[ii] is the sequence with seq[ii] as its last element
     *  if seq[jj] <= seq[ii] (jj < ii):          dp[ii] = dp[jj] + 1;
     *  else:                                     dp[jj] = 1
     */
    vector<int> dp(NN, 1);
    int longest = 1;
    // Store index sets with LIS, 0 put as default one
    vector<int> index = {0};
    // lis[ii] store the LIS sequence with seq[ii] as its last element
    vector<vector<vector<int> > > lis;

    // Initialize all lis[ii] (0 <= ii < NN) with its seq[ii] firstly
    for (int ii = 0; ii < NN; ii++) {
        vector<vector<int> > tmp = { {seq[ii]} };
        lis.push_back(tmp);
    }

    for (int ii = 1; ii < NN; ii++) {
        for (int jj = 0; jj < ii; jj++) {
            if (seq[ii] > seq[jj]) {
                if (dp[ii] < dp[jj] + 1) {
                    dp[ii] = dp[jj] + 1;
                    
                    // Clear existing sequence in lis[ii], as longer one appear.
                    lis[ii].clear();        
                    vector<vector<int> > local;
                    local.assign(lis[jj].begin(), lis[jj].end());
                    for (unsigned int kk = 0; kk < local.size(); kk++) {
                        local[kk].push_back(seq[ii]);
                        lis[ii].push_back(local[kk]);
                    }
                } else if (dp[ii] == dp[jj] + 1) {
                    vector<vector<int> > local;
                    local.assign(lis[jj].begin(), lis[jj].end());
                    for (unsigned int kk = 0; kk < local.size(); kk++) {
                        local[kk].push_back(seq[ii]);
                        lis[ii].push_back(local[kk]);
                    }
                }
            }
        }
        if (longest < dp[ii]) {
            longest = dp[ii];
            index.clear();
            index.push_back(ii);
        } else if (longest == dp[ii]){
            index.push_back(ii);
        }
    }

    printf("Longest Increasing Sequences are:\n");
    for (unsigned int ii = 0; ii < index.size(); ii++) {
        int idx = index[ii];
        for (unsigned int jj = 0; jj < lis[idx].size(); jj++) {
            for (unsigned int kk = 0; kk < lis[idx][jj].size(); kk++) {
                if (kk != 0)
                    printf(" ");
                printf("%d", lis[idx][jj][kk]);
            }
            printf("\n");
        }
    }

    return longest;
}

 对于最长不下降子序列问题,只需要第25行的if (seq[ii] > seq[jj])判断改为if (seq[ii] >= seq[jj])即可。

二分法+贪心算法

 对于最长上升子序列问题,还存在一种时间复杂度为n*logn的算法。这里考虑一个贪心的想法:

1
    对于一个上升子序列,其结尾元素越小,越有利于在后面接其他的元素,上升序列变得更长的可能性就越大。  

 考虑维护一个数组tail[len]表示长度为len的上升序列中结尾元素的最小值,显然可以知道tail数组的值是非递减的。从头至尾遍历整个序列,若:

  • seq[ii] > tail[len] (tail数组最后的一个元素),将seq[ii]加入到tail的最后面,LIS的长度增加1 (len++)
  • seq[ii] <= tail[len],查找tail数组中第一个不小于(大于等于)seq[ii]的数,并用seq[ii]替换。由于tail数组的非递减性,可以使用二分法查找,将查找时间复杂度降为log(n)

 对于采用二分法+贪心算法的求解中对LIS的输出,网上查询未能找到输出所有LIS序列的方法(当LIS不止一个时),仅提供输出其中一个LIS的方法。这种方法中,在从头到尾遍历原序列seq的过程中,记录每个元素在LIS中应该的位置,之后再倒序查找原序列元素对应的LIS位置,即可得到一个LIS。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
// Time complexity: n*log(n)
int LIS_n_log_n(vector<int> &seq, int NN)
{
    /*
     * tail[ii] is the minimum value of last element in sequence which length is ii.
     * cc[ii] is the related position of element seq[ii] in LNDS
     *
     *  if seq[ii] > tail[len] :                                     tail[++len] = seq[ii];
     *  else: find first value of tail which isn't smaller than seq[ii], replace with seq[ii]      tail[jj] = seq[ii]
     *  Could also think:   from the last value of tail which is equal to seq[ii], use it to update the next one.
     */
    vector<int> tail(NN + 1);
    vector<int> cc(NN + 1);
    int len = 1;
    tail[len] = seq[0];
    cc[0] = len;

    for (int ii = 1; ii < NN; ii++) {
        if (seq[ii] > tail[len]) {
            tail[++len] = seq[ii];
            cc[ii] = len;
        } else {
            int idx = lower_bound(tail.begin() + 1, tail.begin() + len + 1, seq[ii]) - tail.begin();
            tail[idx] = seq[ii];
            cc[ii] = idx;
        }
    }

    stack<int> output;
    int jj = len;
    for (int ii = NN - 1; ii >= 0; ii--) {
        if (cc[ii] == jj) {
            output.push(seq[ii]);
            jj--;
        }
        if (0 == jj)
            break;
    }
    while (!output.empty()) {
        int top = output.top();
        printf("%d ", top);
        output.pop();
    }

    return len;
}

 对于最长不下降子序列问题,稍微修改tail数组元素的更新原则:

  • seq[ii] <= tail[len],查找tail数组中第一个大于seq[ii]的数,并用seq[ii]替换。
     将上面程序中第23行的lower_bound函数替换为upper_bound即可。

实际问题实例

 PAT练习题中1045 Favorite Color Stripe即可转化为一个求解LIS的问题。问题描述如下:

Eva is trying to make her own color stripe out of a given one. She would like to keep only her favorite colors in her favorite order by cutting off those unwanted pieces and sewing the remaining parts together to form her favorite color stripe.
It is said that a normal human eye can distinguish about less than 200 different colors, so Eva’s favorite colors are limited. However the original stripe could be very long, and Eva would like to have the remaining favorite stripe with the maximum length. So she needs your help to find her the best result.
Note that the solution might not be unique, but you only have to tell her the maximum length. For example, given a stripe of colors {2 2 4 1 5 5 6 3 1 1 5 6}. If Eva’s favorite colors are given in her favorite order as {2 3 1 5 6}, then she has 4 possible best solutions {2 2 1 1 1 5 6}, {2 2 1 5 5 5 6}, {2 2 1 5 5 6 6}, and {2 2 3 1 1 5 6}.

  • Input Specification:
    Each input file contains one test case. For each case, the first line contains a positive integer N (≤200) which is the total number of colors involved (and hence the colors are numbered from 1 to N). Then the next line starts with a positive integer M (≤200) followed by M Eva’s favorite color numbers given in her favorite order. Finally the third line starts with a positive integer L (≤ 10^4) which is the length of the given stripe, followed by L colors on the stripe. All the numbers in a line a separated by a space.

  • Output Specification:
    For each test case, simply print in a line the maximum length of Eva’s favorite stripe.

  • Sample Input:
    1
    2
    3
    
    6
    5 2 3 1 5 6
    12 2 2 4 1 5 5 6 3 1 1 5 6
    
  • Sample Output:
    1
    
    7
    

问题分析:可以先将Eva最喜欢的颜色以此进行编号,例中{2, 3, 1, 5, 6}的编号即为{0, 1, 2, 3, 4},这实际上转换为了一个次序。将颜色条中不是Eva喜欢的颜色舍去后,将颜色值用前述编号代替,例如例中的颜色条可以转换为{0, 0, 2, 3, 3, 4, 1, 2, 2, 3, 4},这便可以使用求解LNDS的方式进行求解。求解程序如下:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
#include <iostream>
#include <vector>
#include <algorithm>

using namespace std;

int LNDS_n_n(vector<int> &given, int NN, vector<int> &colors)
{
    int longest = 1;
    vector<int> dp(NN, 1);

    for (int ii = 1; ii < NN; ii++) {
        for (int jj = 0; jj < ii; jj++) {
            if (given[jj] <= given[ii]) {
                dp[ii] = max(dp[ii], dp[jj] + 1);
            }
        }
        longest = max(longest, dp[ii]);
    }

    return longest;
}
int main(void)
{
    int NN = 0, MM = 0, LL = 0;
    scanf("%d", &NN);
    // Store whether a color is Eva's favorite or not, presence[ii] == 1 means color ii is valid.
    vector<int> presence(NN+1);
    // Store order of favorite color, order[ii] = 0 means color ii is most favorite one.
    vector<int> order(NN+1);
    vector<int> colors;
    scanf("%d", &MM);

    int color = 0;
    for (int ii = 0; ii < MM; ii++) {
        scanf("%d", &color);
        presence[color] = 1;
        order[color] = ii;
        colors.push_back(color);
    }
    scanf("%d", &LL);
    vector<int> given;
    for (int ii = 0; ii < LL; ii++) {
        scanf("%d", &color);
        if (1 == presence[color])
            given.push_back(order[color]);
    }
    printf("%d\n", LNDS_n_n(given, given.size(), colors));

    return 0;
}
This post is licensed under CC BY 4.0 by the author.

最长上升子序列(Longest Increasing Subsequence)

Redis源码笔记-Linux 环境下自定义的宏定义

Comments powered by Disqus.