正在加载图片...
326 DAVID MCSHERRY Theorem 2: The size of the recovery set for an unsuccessful query Q can never be more than Ci/2 where k=Latts(Q)] and Lk/2] is the integer part of k/2 Proof: First we note that for any distinct sub-queries 01, 22E rs(Q), atts(@1)and atts(22) are incomparable subsets of atts(o). For example, if atts(Q1)C atts( @2) then @1 would be a sub-query of @2 But this cannot be the case by Lemma 2. That the size of the recov ery set for Q can never exceed the specified limit immediately follows from Sperner's(1928)proof that the maximum number of incompara ble subsets of any set of size m is mCIm/2) n For a query involving 4 attributes, like the example query in Fig- l, it follows from Theorem 2 that the size of the recovery set can never exceed*C2=6. Table I shows the corresponding limits for que ries ranging in length from 1 to 10. It can be seen that recovery-set sizes are always within reasonable limits for queries involving less than five or six attributes. The same cannot be said for the recovery-set sizes that are possible, at least in theory, for longer queries. However, we now present empirical evidence which suggests that recovery sets tend to be much smaller in practice that their maximum possible sizes. Our evaluation focuses on full-length queries on the Travel case library(www.ai-cbr.org),astandardbenchmarkcontainingthedescri ptions of more than 1000 holidays in terms of eight attributes such as price, region, and holiday type. Our experimental method is based on a leave-one-out approach in which we temporarily remove each case from the case library, present its description as a query to our recom mender system, and record the size of the recovery set for each unsuc cessful query. In order for a query to be successful, there must be at least one case with exactly matching values for all attributes except price. In addition, the price for a matching case must not exceed the value specified in the query; that is, the specified price is treated as a preferred maximum. Given the length of the queries used in our exper iment, it is perhaps not surprising that less than 3% of them were successful Table 1. Maximum recovery-set sizes for queries of increasing length Query length: 1 2 3 4 5 6 Maximum size:123610203570126252326 DAVID MCSHERRY Theorem 2: The size of the recovery set for an unsuccessful query Q can never be more than kCk/2 where k = atts(Q) and k/2 is the integer part of k/2. Proof: First we note that for any distinct sub-queries Q1, Q2∈ rs(Q), atts(Q1) and atts(Q2) are incomparable subsets of atts(Q). For example, if atts(Q1)⊂ atts(Q2) then Q1 would be a sub-query of Q2. But this cannot be the case by Lemma 2. That the size of the recov￾ery set for Q can never exceed the specified limit immediately follows from Sperner’s (1928) proof that the maximum number of incompara￾ble subsets of any set of size m is mCm/2. For a query involving 4 attributes, like the example query in Fig￾ure 1, it follows from Theorem 2 that the size of the recovery set can never exceed 4C2 =6. Table 1 shows the corresponding limits for que￾ries ranging in length from 1 to 10. It can be seen that recovery-set sizes are always within reasonable limits for queries involving less than five or six attributes. The same cannot be said for the recovery-set sizes that are possible, at least in theory, for longer queries. However, we now present empirical evidence which suggests that recovery sets tend to be much smaller in practice that their maximum possible sizes. Our evaluation focuses on full-length queries on the Travel case library (www.ai-cbr.org), a standard benchmark containing the descri￾ptions of more than 1000 holidays in terms of eight attributes such as price, region, and holiday type. Our experimental method is based on a leave-one-out approach in which we temporarily remove each case from the case library, present its description as a query to our recom￾mender system, and record the size of the recovery set for each unsuc￾cessful query. In order for a query to be successful, there must be at least one case with exactly matching values for all attributes except price. In addition, the price for a matching case must not exceed the value specified in the query; that is, the specified price is treated as a preferred maximum. Given the length of the queries used in our exper￾iment, it is perhaps not surprising that less than 3% of them were successful. Table 1. Maximum recovery-set sizes for queries of increasing length Query length: 1 2 3 4 5 6 7 8 9 10 Maximum size: 1 2 3 6 10 20 35 70 126 252
<<向上翻页向下翻页>>
©2008-现在 cucdc.com 高等教育资讯网 版权所有