I have two questions:
1) Why is my code adding a carriage return at the beginning of the selected_line string?
2) Do you think the algorithm I'm using to return a random line from the file is good enough and won't cause any problems?
A sample file is:
line
number one
#
line number two
My code:
#include <cstdlib>
#include <ctime>
#include <fstream>
#include <iostream>
#include <string>
using namespace std;

int main()
{
    srand(time(0));
    ifstream read("myfile.dat");
    string line;
    string selected_line;
    int nlines = 0;
    while(getline(read, line, '#')) {
        if((rand() % ++nlines) == 0)
            selected_line = line;
    }
    // this is adding a \n at the beginning of the string
    cout << selected_line << endl;
}
Thanks in advance for your help.
EDIT: OK, what some of you suggested makes a lot of sense. The string is probably being read as "\nmystring". So I guess my question now is: how would I remove the first \n from the string?
-
Because you don't specify \n as a delimiter.
-
Your "random" selection is questionable. On the first pass it always selects the first line, since rand() % 1 is always 0, and every later line only replaces it when rand() % nlines happens to be 0. The straightforward way to select a line uniformly is to know the number of lines present first.
In addition, why are you using # as a delimiter? getline, by default, gets a line (ending with \n).
-
The newlines can appear from the second line that you print. This is because the getline function halts on seeing the # character and, the next time it is called, resumes from where it left off, i.e. one character past the #, which as per your input file is a newline. Read the C FAQ 13.16 on effectively using rand().
One suggestion is to read the entire file in one go, store the lines in a vector and then output them as required.
Pukku : Yep - when you have the lines in a vector, it will be easy to pick one at random.
-
Because # is your delimiter, the \n that exists right after that delimiter will be the beginning of your next line, thus putting the \n in front of your line.
-
1) You're not adding a \n to selected_line. Instead, by specifying '#' as the delimiter you are simply not removing the extra \n characters in your file. Note that your file actually looks something like this:
line\n
number one\n
#\n
line number two\n
So "line number two" is actually read as "\nline number two\n".
2) No. The straightforward way to randomly select a line is to determine the number of lines in your file first.
Naaff : To remove whitespace from an ifstream (before you call getline), you can do something like this: while(isspace(read.peek())) read.ignore();
-
What you probably want is something like this:
std::vector<std::string> allParagraphs;
std::string currentParagraph;
while (std::getline(read, line)) {
    if (line == "#") { // modify this condition, if needed
        // paragraph ended, store to vector
        allParagraphs.push_back(currentParagraph);
        currentParagraph = "";
    } else {
        // paragraph continues...
        if (!currentParagraph.empty()) {
            currentParagraph += "\n";
        }
        currentParagraph += line;
    }
}
// store the last paragraph, as well
// (in case it was not terminated by #)
if (!currentParagraph.empty()) {
    allParagraphs.push_back(currentParagraph);
}
// this is not extremely random, but will get you started
// (make sure allParagraphs is not empty before taking the modulo)
size_t selectedIndex = rand() % allParagraphs.size();
std::string selectedParagraph = allParagraphs[selectedIndex];
For better randomness, you could opt for this instead:
// RAND_MAX + 1.0 is evaluated as a double, so it cannot overflow an int
size_t selectedIndex = rand() / (RAND_MAX + 1.0) * allParagraphs.size();
This is because the least significant bits returned by rand() tend to behave not so randomly at all.
nmuntz : Excellent solution! Thank you very, very much! I have learned a lot from this solution that you have posted. Thanks again!
Pukku : You are welcome. I hope it wasn't homework :)
-
You could use the substr method of the std::string class to remove the \n after you decide which line to use:
if ( line.substr(0,1) == "\n" ) { line = line.substr(1); }
As others have said, the simplest way to select the lines with uniform randomness is to read all the lines first and then select a line number. Note that a variant such as if (rand() % (++nlines+1)) is not uniform: the condition is true whenever the remainder is nonzero, so line 1 is taken with probability 1/2, line 2 with probability 2/3, and so on, which favors later lines.