Random Number Generation with Parameters (2024)

Demosthenis Kasastogiannis, 17 Aug 2024, 9:02

Dear All,

I kindly ask for your assistance. I am trying to generate a time series of approximately 500 data points with the following parameters:

1)values between min=0 max=1500

2) 0-100 range 50% of total data, 101-200 25% of total data etc

3) x/x+1 data should be +/-10% at 20% of data, x/x+1 should be between +/-10% - +/-20 at 15% of total data etc

Any ideas will be highly appreciated

What does this mean:

3) x/x+1 data should be +/-10% at 20% of data, x/x+1 should be between +/-10% - +/-20 at 15% of total data etc

Does it say something about the ratio of consecutive terms? And what exactly does it say? I'm sorry, but that line is highly confusing.

Apologies for the poor description...

Concerning the third parameter let me provide an example.

Take 100, 120, 135, ...: the percentage difference between 100 and 120 is 20% (100 x 1.2), while the percentage difference between 120 and 135 is 12.5%. I would like to set how many percentage differences of each size occur within the generated series: for example, differences of 0%-10% in 20% of the data (500*20% = 100 points), differences of 10%-20% in 15% of the data (500*15% = 75 points), etc. I hope it is clearer now. As I do want to generate consecutive data with 50% and 60% differences, it will be useful if consecutive generated values do not always fall in the same or neighbouring ranges (0-100, 101-200, 201-300, etc.), e.g. 100, 275, 510, etc.

Many thanks!!!
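The arithmetic in the example above can be checked with a short MATLAB sketch. The variable names (v, pd, edges, counts) are illustrative only, and the band edges are simply the 0-10% and 10-20% bands mentioned in the post plus an open-ended band above 20%.

```matlab
v = [100 120 135];                    % example series from the post
pd = 100*diff(v)./v(1:end-1);         % percent change between consecutive values
disp(pd)                              % shows 20 and 12.5
edges = [0 10 20 Inf];                % bands: [0,10), [10,20), [20,Inf] (illustrative)
counts = histcounts(abs(pd), edges);  % number of changes falling in each band
disp(counts)
```

Dividing counts by numel(pd) gives the fraction of steps in each band, which is the quantity the third requirement constrains.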



Amith, 17 Aug 2024, 11:51

Hi Demosthenis,

I understand that you want to generate random numbers based on a few rules/parameters.

The first and second conditions can be met using the `randi` function. For instance, you can generate 50% of the 500 numbers within the range 0-100. The next 25% of the data can be in the range 101-200, while the remaining numbers can fall within the range 201-1500. For example, to generate 100 numbers in the range 0-100, you can use:

r = randi([0 100],1,100)
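A minimal sketch of the full idea, assuming the 50% / 25% split from the original question and assuming (since the question only says "etc") that the remaining 25% is spread uniformly over 201-1500. The names n1, n2, n3 are illustrative, and the final shuffle is added so that values from different ranges are interleaved in time.

```matlab
rng(0);                                   % fixed seed, only for repeatability
N = 500;                                  % total number of samples
n1 = round(0.50*N);                       % 250 values in 0-100
n2 = round(0.25*N);                       % 125 values in 101-200
n3 = N - n1 - n2;                         % remaining 125 values in 201-1500 (assumed split)
r = [randi([0 100],1,n1), randi([101 200],1,n2), randi([201 1500],1,n3)];
r = r(randperm(N));                       % shuffle so the ranges are interleaved in time
fprintf('%.1f %% in 0-100, %.1f %% in 101-200\n', ...
    100*mean(r<=100), 100*mean(r>100 & r<=200));
```

This fixes the share of each range exactly, but it says nothing about the differences between consecutive values.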

However, the third condition is unclear and somewhat confusing. It would be helpful if you could provide more details or elaborate further on it.

Hope this helps!

Demosthenis Kasastogiannis, 17 Aug 2024, 12:20 (edited 17 Aug 2024, 12:34)

Thank you very much Amith!! It is Highly appreciated



@Demosthenis Kasastogiannis,

I share @John D'Errico's question. I am not sure I understand your reply to his question. My understanding of your wishes is below. If my interpretation is wrong, please explain it again, and explain why you want what you want, because maybe that will help me understand.

My interpretation of your reply to @John D'Errico: You would like consecutive numbers in the "random" data set to differ by 0 to 10%, in 20% of cases. You would like consecutive numbers in the data set to differ by 10-20% in 15% of cases. I think you want consecutive numbers to differ by 50% to 60% in some cases, but I am not sure how often.

In your original post, you said you want

A. Approximately 500 random numbers in the range [0,1500].

B. "50% of numbers between 0 and 100, 25% of numbers between 100 and 200, etc."

C. The requirements about differences between consecutive numbers, described above.

Requirement B is met by an exponential distribution with mean mu = 100/ln(2), approximately 144.3. Its median is 100, so half of the values fall below 100, and a quarter fall between 100 and 200.
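As a quick check of requirement B, the band probabilities can be computed directly from the exponential CDF, with mu = 100/log(2) as in this answer's code. This is a sketch using only base MATLAB; the names F, p1, p2 and pTail are illustrative.

```matlab
mu = 100/log(2);                  % mean of the exponential distribution
F = @(t) 1 - exp(-t/mu);          % exponential CDF, P(X <= t)
p1 = F(100);                      % P(0 <= X < 100)
p2 = F(200) - F(100);             % P(100 <= X < 200)
pTail = 1 - F(1500);              % P(X > 1500)
fprintf('P(<100)=%.3f, P(100-200)=%.3f, P(>1500)=%.2e\n', p1, p2, pTail);
```

The probabilities are 0.5 and 0.25 for the first two bands, and the chance of a single value exceeding 1500 is 2^-15, roughly 3 in 100,000.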

Random numbers from this distribution may, rarely, exceed 1500. We can check for any such values and delete them if found, to comply with requirement A.

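N=500; % number of initial points

mu=100/log(2); % distribution parameter

x=exprnd(mu,1,N); % N exponential random numbers with mean mu (reconstructed line)

[~,ind]=find(x<=1500); % indices of elements of x<=1500

y=x(ind); % y=elements of x that are <=1500

M=length(y); % number of points kept (reconstructed line)

fprintf('length(y)=%d; min=%.1f, max=%.1f.\n',M,min(y),max(y));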

length(y)=500; min=0.4, max=988.9.

fprintf('0<=y<100 in %.1f %% of cases.\n',sum(y<100)*100/M)

0<=y<100 in 47.4 % of cases.

fprintf('100<=y<200 in %.1f %% of cases.\n',sum(y>=100 & y<200)*100/M)

100<=y<200 in 23.8 % of cases.

The results above indicate that vector y satisfies requirements A and B above.

For requirement C, let us first see what the distribution of percent differences between successive values looks like if we use vector y.


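percentDiff=100*diff(y)./y(1:end-1); % percent change between successive values (reconstructed line)

histogram(percentDiff, 'Normalization','probability')

grid on; ylabel('Probability')

xlabel('Percent Difference'); title('Difference Between Consecutive Values')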

[Figure: histogram of the percent difference between consecutive values, full range]

The plot shows that the difference between consecutive elements is between -1000% and 0% in approximately 50% of cases, between 0% and +1000% in approximately 45% of cases, and above +1000% in the remaining cases. It makes sense that the difference is negative half the time for this random sequence. In fact, we know from the definition of percent change, and the fact that the distribution is non-negative, that the successive difference can never be smaller than -100%. Let us make a histogram plot with an expanded horizontal axis to learn more.



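histogram(percentDiff,-100:10:100,'Normalization','probability') % reconstructed line; 10%-wide bins assumed

xlim([-120,100]); grid on; ylabel('Probability')

xlabel('Percent Difference'); title('Difference Between Consecutive Values')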

[Figure: histogram of the percent difference between consecutive values, horizontal axis limited to -120% to +100%]

The histogram above shows that the difference between successive elements is between -10% and +10% in about 6.0% of cases (the sum of the heights of the two central bars). (Exact values will differ when you run the code, due to randomness.) The difference is in the range -20% to -10%, or +10% to +20%, in 5.4% of cases. I think you want the difference to be between -10% and +10% in 20% of cases, and you want the difference to be in the range -20% to -10%, or +10% to +20%, in 15% of cases.

I have demonstrated how to make a vector of values that satisfies conditions A and B, and I have demonstrated how to evaluate whether condition C is satisfied.
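The band fractions for condition C can also be computed numerically rather than read off the histogram. The sketch below is self-contained and makes one substitution, which is an assumption on my part: it draws the exponential samples by inverse transform, -mu*log(rand), so that it runs without the Statistics and Machine Learning Toolbox function exprnd. The names f10 and f20 are illustrative.

```matlab
rng(1);                                   % fixed seed, only for repeatability
mu = 100/log(2);                          % same distribution parameter as above
y = -mu*log(rand(1,500));                 % exponential samples by inverse transform (stand-in for exprnd)
y = y(y<=1500);                           % enforce requirement A
pd = 100*diff(y)./y(1:end-1);             % percent change between consecutive values
f10 = mean(abs(pd) < 10);                 % fraction of steps within +/-10%
f20 = mean(abs(pd) >= 10 & abs(pd) < 20); % fraction of steps in the 10-20% band (either sign)
fprintf('|diff|<10%%: %.1f %% of cases; 10-20%%: %.1f %% of cases\n', 100*f10, 100*f20);
```

Comparing f10 and f20 with the targets (0.20 and 0.15) gives a direct pass/fail test for any candidate sequence.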

To make a sequence w() that satisfies condition C (but maybe not conditions A and B), you could try the equation

w(j+1) = a(j)*w(j)


where a(j) is a random number in the range 0.9 to 1.1 in 20% of cases, and a(j) is random in the range (0.8 to 0.9 or 1.1 to 1.2) in 15% of cases, etc. I tried the sequence above. It is tricky to work with. The results are sensitive to the details of the probability distribution of the coefficients a(j). Sequences w(j) that go to 0 (when long) are observed for some a(j) distributions. Some a(j) distributions are likely to produce w(j) sequences that are very large at times.
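A minimal sketch of such a multiplicative sequence, w(j+1) = a(j)*w(j), under stated assumptions: the first two bands follow the 20% and 15% shares described above, while the band for the remaining 65% of steps (changes of 20% to 60%), the symmetric random sign, and the starting value of 100 are all hypothetical choices made only for illustration.

```matlab
rng(2);                                        % fixed seed, only for repeatability
n = 500;                                       % number of steps
nA = round(0.20*n); nB = round(0.15*n);        % 100 steps within 10%, 75 steps in 10-20%
nC = n - nA - nB;                              % remaining steps (band below is hypothetical)
mag = [0.10*rand(1,nA), ...                    % |change| in (0, 0.10)
       0.10 + 0.10*rand(1,nB), ...             % |change| in (0.10, 0.20)
       0.20 + 0.40*rand(1,nC)];                % |change| in (0.20, 0.60), assumed
sgn = 2*(rand(1,n) > 0.5) - 1;                 % random sign, +1 or -1
a = 1 + sgn.*mag;                              % multipliers a(j)
a = a(randperm(n));                            % shuffle the order of the steps
w = 100*cumprod([1, a]);                       % w(1)=100 (arbitrary start), w(j+1)=a(j)*w(j)
pd = 100*(a - 1);                              % percent change implied by each step
fprintf('within 10%%: %.1f %%, 10-20%%: %.1f %%, range of w: [%.3g, %.3g]\n', ...
    100*mean(abs(pd)<10), 100*mean(abs(pd)>=10 & abs(pd)<20), min(w), max(w));
```

Because the step counts are fixed and then shuffled, the band fractions are met by construction. With multipliers that are symmetric around 1 like these, the average of log(a(j)) is negative, so w tends to decay over long runs, and nothing in this sketch keeps w inside [0,1500] or matches requirement B.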

I would not be surprised if it is impossible to satisfy conditions A, B, and C simultaneously.

Demosthenis Kasastogiannis, about 11 hours ago

Hi William, many thanks for your detailed answer and time spent.

My aim is to generate, from a large data series (40,000 points), a smaller series (500 points) of an engine's power profile with the same characteristics as the original profile. In my description, a) the ranges (0-100, 101-200, ...) are the energy intervals in which the engine works, and b) the % changes represent the increases and decreases in energy demand from the engine between consecutive time steps, t(x) and t(x+1). My difficulty is in satisfying condition C; the first condition can also be described as a probability of occurrence for each power level, which seems to work:

pmf = [All the probabilities of the power levels occuring on the initial data series (40.000) for each value];

population = 1:1500;

sample_size = 500;

random_number = randsample(population,sample_size,true,pmf);

I am not sure whether my approach to solving this problem is the correct one.

William Rose, about 2 hours ago (edited about 2 hours ago)

@Demosthenis Kasastogiannis,

[Edit: Add ")" in my comments, where I forgot to include it.]

Can you upload a mat file with the 40000 power levels?

You want a vector y, of length 500, whose elements are engine power levels at specific times. The power levels are integers in the range [1:1500].* The values of y are to be selected randomly but should follow certain probabilities. The probabilities of selecting each integer in [1:1500] are obtained from the probability of that integer occurring in the sequence x, which has length 40000. (I assume the sequence x has only integers, and min(x)=1 and max(x)=1500.)

You can generate a sequence y that meets the requirements above, as follows:

% x1=40000 random integers with approximately exponential distribution.

% If I had your file of 40000 integers, I would read it in to get x.

x1=round(0.5+exprnd(100/log(2),[1,40000])); % engine power (integers)


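ind=find(x1<=1500); % indices of elements of x1<=1500 (reconstructed line)

x=x1(ind); % x=x1, without any values >1500 (engine power)

N=length(x); fprintf('Length(x)=%d, min(x)=%d, max(x)=%d.\n',...

    N,min(x),max(x)); % (reconstructed line)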
Length(x)=39998, min(x)=1, max(x)=1472.

pop=1:1500; % population of random integers

% Next: Compute the weighting vector w, whose elements are

% the number of occurrences in x of each value in pop

w=zeros(1,length(pop)); % allocate w

for i=1:length(pop)

    w(i)=sum(x==pop(i)); % occurrences of pop(i) in x (reconstructed line)

end



fprintf('w includes %d values=0.\n',sum(w==0))

w includes 603 values=0.

% Next: y=500 samples from pop using the weighting factors w

y=randsample(pop,500,true,w); % engine power

% Plot w=weighting factors and the sequence y.


subplot(211), plot(1:1500,w,'b.')

xlabel('Population value'); ylabel('Weight'); title('w'); grid on

subplot(212), plot(1:500,y,'-r.')

xlabel('Time'); ylabel('Power'); title('y=Engine Power'); grid on

[Figure: top panel, weights w versus population value; bottom panel, y = engine power versus time]
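The weight vector w above (the number of occurrences in x of each value in pop) can also be computed without a loop. The sketch below uses accumarray and compares the result with an explicit counting loop; the data x here is a uniform random stand-in (an assumption) and should be replaced by the real power levels.

```matlab
rng(3);                                      % fixed seed, only for repeatability
x = randi([1 1500],1,40000);                 % stand-in data (assumed); replace with the real power levels
pop = 1:1500;                                % population of integer power levels
w = accumarray(x(:),1,[numel(pop) 1]).';     % w(k) = number of occurrences of k in x
wLoop = zeros(1,numel(pop));                 % same counts with an explicit loop, for comparison
for i = 1:numel(pop)
    wLoop(i) = sum(x==pop(i));
end
fprintf('Counts agree: %d; sum(w)=%d\n', isequal(w,wLoop), sum(w));
```

This form relies on x containing only integers from 1 to 1500; a value of 0 would need an offset of 1 before being used as an index.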

A key feature of y constructed above is that its autocorrelation is basically 0 for lags greater than 0, i.e. y(n) and y(n-1) have no correlation. You want the random variable z(n)=y(n)/y(n-1) to have a certain probability distribution. This is not possible if you generate y() with randsample(). See my previous comment for a suggestion for how to generate a sequence y such that z has a desired probability distribution.

*Is the lower limit for power 0 or 1? You say 1:1500 in one place, but you also say "ranges ie 0-100,101-200...".
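The point about zero autocorrelation can be checked numerically. The sketch below estimates the lag-1 correlation of a sequence of independent draws; it uses randi as a stand-in for randsample (an assumption made so that it runs in base MATLAB), and the names R and r1 are illustrative.

```matlab
rng(4);                                   % fixed seed, only for repeatability
y = randi([1 1500],1,500);                % stand-in for independently drawn samples (assumed)
R = corrcoef(y(1:end-1), y(2:end));       % 2x2 correlation matrix of y(n-1) and y(n)
r1 = R(1,2);                              % lag-1 correlation coefficient
fprintf('Lag-1 correlation of y: %.3f\n', r1);
```

For independent draws r1 should be close to 0. Running the same check on the original 40000-point series would show how strongly consecutive power levels are related in the real data.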






