[personal profile] jesse_the_k

Solved! The expression is \{[^}]*\} (thanks, [personal profile] sonia!)

I'm writing collaboratively with plain text files. I enclose my comments in braces (aka curly brackets). Here's what my commented text looks like:

{beginning of a comment.}
Lorem Ipsum has been the industry's {how can something that old be a "standard"?} dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it.

{I think the bulk of this paragraph should be deleted:
Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book.} It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.


So: sometimes comments are on their own lines, sometimes they're in the middle of a paragraph, and sometimes they enclose a paragraph.

What's the grep/regular expression pattern that deletes all content between braces and then deletes the braces themselves? (I'm on a Mac using a lovely GUI-ish utility called RegExRX, PCRE v8.42.)

⇾1

(no subject)

Date: 2020-11-30 06:03 pm (UTC)
From: [personal profile] sonia
Commenting to let you know I'm taking a look, and I will edit the comment if/when I have useful results! I will try downloading that utility, since the fine implementation details tend to vary.

Here's the pattern that worked for me. Let me know how it goes for you.
\{[^}]*\}

Explanation: 
\{       Match a { 
[^}]     Match anything except a }
*        0 or more times
\}       Match a }
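
For anyone who would rather script this than use a GUI, here is a minimal sketch of the same pattern in Python. Python's re module is not PCRE, but as far as I can tell this pattern uses nothing engine-specific. The function name strip_comments and the shortened sample text are made up for illustration.

```python
import re

# Same pattern as above: a {, any run of characters that are not }, then a }.
COMMENT = re.compile(r"\{[^}]*\}")

def strip_comments(text: str) -> str:
    """Delete every {...} span, braces included."""
    return COMMENT.sub("", text)

sample = (
    "{beginning of a comment.}\n"
    "the industry's {how old?} dummy text\n"
    "{delete this:\n"
    "whole paragraph.} It has survived."
)
print(strip_comments(sample))
```

Because [^}] also matches a newline, the comment that wraps a whole paragraph is removed too, as long as the text is handed over as one string.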
Edited Date: 2020-11-30 06:15 pm (UTC)
⇾3

Re: Zippity doo dah!

Date: 2020-12-01 01:37 am (UTC)
From: [personal profile] sonia
Yay, I'm glad it worked! I did spare a thought for nested expressions, but that's a more complex problem.

While I like your jewelry and appreciate the offer, I will humble-brag that the first expression I typed into the app worked, so it's not like I put a lot of time in. Happy to share knowledge without a tangible reward. :-)

Also, I noticed that was a very thoughtfully formed question! All I had to do was drop your pre-supplied test data into the app.
⇾4

Re: Zippity doo dah!

Date: 2020-12-01 06:42 am (UTC)
From: [personal profile] mdlbear
I'm really glad this works -- it's much simpler than my sketch of a solution. (On the other hand, some of the loops can handle nested braces. I'd use Perl for that.)
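
For the nested case, one common approach is to count brace depth instead of using a regex. Here is a rough sketch in Python rather than Perl, purely for illustration; strip_nested is a made-up name, and the handling of stray braces is my assumption about what you'd want.

```python
def strip_nested(text: str) -> str:
    """Drop everything inside braces, tracking nesting depth."""
    out = []
    depth = 0
    for ch in text:
        if ch == "{":
            depth += 1              # entering a (possibly nested) comment
        elif ch == "}" and depth > 0:
            depth -= 1              # leaving one level of comment
        elif depth == 0:
            out.append(ch)          # outside any comment: keep the character
    return "".join(out)

print(strip_nested("a {b {c} d} e"))
```

A stray } outside any comment is kept as-is, and an unclosed { swallows everything after it, so this is only safe on text where the braces balance.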
⇾5

Re: Zippity doo dah!

Date: 2020-12-01 06:33 pm (UTC)
From: [personal profile] sonia
Yeah, handling all the edge cases gets messy in a hurry. I've written enough parsing code to back away slowly. :-)
⇾1

(no subject)

Date: 2020-11-30 06:30 pm (UTC)
From: [personal profile] mdlbear
sed -E "s/\{[^}]*\}//g" foo   # foo contains the text quoted in the post


...works on the first example, but not the second because sed doesn't have any way (that I know of) to make a substitute command work across line boundaries. ISTR Perl does, using the /s suffix and assuming you've sucked the entire file into a string. Your utility might have something similar, allowing it to treat end-of-line as an ordinary character.
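
To make the line-boundary point concrete, here is a small Python sketch (the function names are made up for illustration). Applying the substitution one line at a time, the way sed does by default, misses a comment that spans lines. Reading everything into one string first fixes that, because [^}] will match a newline like any other character.

```python
import re

PATTERN = re.compile(r"\{[^}]*\}")

def strip_per_line(text: str) -> str:
    # Mimics line-at-a-time processing: each line is handled on its own.
    return "".join(PATTERN.sub("", line) for line in text.splitlines(keepends=True))

def strip_whole(text: str) -> str:
    # Whole input as one string, so a match may cross line boundaries.
    return PATTERN.sub("", text)

text = "keep {drop\nthis too} keep\n"
print(repr(strip_per_line(text)))
print(repr(strip_whole(text)))
```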

Otherwise you need a loop something like the following vaguely Perlish:
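while (defined(my $c = getc(STDIN))) {
    if ($c ne '{') {
        print $c;    # outside a comment: copy the character through
    } else {
        # inside a comment: discard up to and including the closing brace
        do { $c = getc(STDIN) } until (!defined($c) || $c eq '}');
    }
}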

Edited Date: 2020-11-30 06:33 pm (UTC)
⇾2

(no subject)

Date: 2020-11-30 07:42 pm (UTC)
From: [personal profile] mdlbear
It can be done more efficiently with regexes, but it's more complex. The following is vaguely perl-like but hasn't been tested.

while (defined(my $line = <STDIN>)) {
    while (1) {
        $line =~ s/\{[^}]*\}//g;       # delete all fully-balanced {...}
        last unless $line =~ /\{/;     # done if no unmatched open brace is left
        my $next = <STDIN>;            # otherwise we tack on another line
        last unless defined $next;     # give up at end of input
        $line .= $next;
    }
    print $line;
}
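
The same accumulate-until-closed idea can be sketched in Python as a generator. It only buffers lines while a comment is still open, so it never needs the whole file in memory. The name strip_stream is hypothetical, and like the Perl above it makes no attempt at nested braces.

```python
import re
from typing import Iterable, Iterator

PATTERN = re.compile(r"\{[^}]*\}")

def strip_stream(lines: Iterable[str]) -> Iterator[str]:
    """Yield the input with {...} comments removed, buffering only open comments."""
    buf = ""
    for line in lines:
        buf = PATTERN.sub("", buf + line)   # delete every closed {...}
        if "{" not in buf:                  # nothing left open: safe to emit
            yield buf
            buf = ""
    if buf:                                 # unterminated comment at end of input
        yield buf

print("".join(strip_stream(["a {b\n", "c} d\n", "e\n"])), end="")
```

Passing an open file object or sys.stdin as lines should work, since both iterate line by line with the newlines kept.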
⇾4

(no subject)

Date: 2020-12-01 05:48 pm (UTC)
From: [personal profile] mdlbear
Most welcome! It was fun, but I'm glad it wasn't needed.

I don't think anyone in my household needs jewelry at the moment (Colleen would be the only one, and I think her jewelry cabinet has been opened maybe twice since we moved in three years ago).

What kind of jewelry do you make?
⇾6

(no subject)

Date: 2020-12-01 11:48 pm (UTC)
From: [personal profile] mdlbear
Those are lovely!
